C# Programming Tips: How to Read a Word Document

ZDNet Software Channel | Date: 2009-11-27 | Source: IT168

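  [IT168 Technical Document] First, add the reference: Solution Explorer -> References -> Add -> COM -> Browse -> C:\Program Files\Microsoft Office\OFFICE11\MSWORD.OLB. I am using Office 2003 and cannot speak for other versions. Visual Studio automatically generates a .NET interop DLL from the OLB file.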

  Usage:

  object oMissing = System.Reflection.Missing.Value;
  Word.Application oWord = new Word.Application();
  oWord.Visible = false; // keep the Word application hidden

  // Create a new Word document
  Word.Document oDoc = oWord.Documents.Add(ref oMissing, ref oMissing, ref oMissing, ref oMissing);

  // Copy and paste the document content
  oDoc.Content.Copy();
  oDoc.Content.Paste();

  // Save the document under another name.
  // fileName and saveFormat are object variables the caller must declare beforehand.
  oDoc.SaveAs(ref fileName, ref saveFormat, ref oMissing, ref oMissing,
      ref oMissing, ref oMissing, ref oMissing, ref oMissing,
      ref oMissing, ref oMissing, ref oMissing, ref oMissing,
      ref oMissing, ref oMissing, ref oMissing, ref oMissing);

  // Other settings
  oDoc.PageSetup.PaperSize = Word.WdPaperSize.wdPaperA3; // paper size
  oDoc.PageSetup.Orientation = Word.WdOrientation.wdOrientLandscape; // landscape or portrait
  oDoc.PageSetup.TextColumns.SetCount(2); // number of text columns

  // Quit Word.
  // b is an object holding the save-changes option, also declared by the caller.
  oWord.Application.Quit(ref b, ref oMissing, ref oMissing);
  System.Runtime.InteropServices.Marshal.ReleaseComObject(oWord);

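  All further work on the Word document goes through the oDoc object, which can do anything Word itself can do. It exposes many more members; explore them if you are interested.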

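  Another approach:

  1: Add a reference to the Microsoft Word 11.0 Object Library to the project.

  2: Add using Word = Microsoft.Office.Interop.Word; to the program.

  3: Add the following to the program:

  Word.Application app = new Microsoft.Office.Interop.Word.Application(); // starts the Word application

  Word.Document doc = null; // will hold the document opened in Word

  Note that a Word document and the Word application are not the same thing!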

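  4: Extracting content from Word generally needs only a handful of methods:

  public override void openFile(object fileName) {} // open a document

  public override object readPar(int i) {} // read paragraph i of the Word document

  public override int getParCount() {} // return the number of paragraphs in the Word document

  public override void closeFile() {} // close the document

  public override void quit() {} // quit the Word application

  // Tables of contents copied from web pages sometimes contain manual line breaks (^l).
  // Replace them with paragraph marks (^p) first so the text is read correctly.

  public void replaceChar() {}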

  5: Code

  public override void openFile(object fileName)
  {
      try
      {
          if (app.Documents.Count > 0)
          {
              if (MessageBox.Show("A Word document is already open. Close it and open the new document?", "Prompt", MessageBoxButtons.YesNo) == DialogResult.Yes)
              {
                  object unknow = Type.Missing;
                  doc = app.ActiveDocument;
                  if (MessageBox.Show("Do you want to save it?", "Save", MessageBoxButtons.YesNo) == DialogResult.Yes)
                  {
                      app.ActiveDocument.Save();
                  }
                  app.ActiveDocument.Close(ref unknow, ref unknow, ref unknow);
                  app.Visible = false;
              }
              else
              {
                  return;
              }
          }
      }
      catch (Exception)
      {
          // The Word instance is no longer usable (for example, the user closed it), so start a new one.
          //MessageBox.Show("You may have closed the document");
          app = new Microsoft.Office.Interop.Word.Application();
      }

      try
      {
          object unknow = Type.Missing;
          app.Visible = true;
          doc = app.Documents.Open(ref fileName,
              ref unknow, ref unknow, ref unknow, ref unknow, ref unknow,
              ref unknow, ref unknow, ref unknow, ref unknow, ref unknow,
              ref unknow, ref unknow, ref unknow, ref unknow, ref unknow);
      }
      catch (Exception ex)
      {
          MessageBox.Show("An error occurred: " + ex.ToString());
      }
  }

  public override object readPar(int i)
  {
      try
      {
          // Word collections are 1-based: the first paragraph is Paragraphs[1].
          string temp = doc.Paragraphs[i].Range.Text.Trim();
          return temp;
      }
      catch (Exception e)
      {
          MessageBox.Show("Error:" + e.ToString());
          return null;
      }
  }

  public override int getParCount()
  {
      return doc.Paragraphs.Count;
  }

  public override void closeFile()
  {
      try
      {
          object unknow = Type.Missing;
          // Let Word ask the user whether to save pending changes.
          object saveChanges = Word.WdSaveOptions.wdPromptToSaveChanges;
          app.ActiveDocument.Close(ref saveChanges, ref unknow, ref unknow);
      }
      catch (Exception ex)
      {
          MessageBox.Show("Error:" + ex.ToString());
      }
  }

  public override void quit()
  {
      try
      {
          object unknow = Type.Missing;
          // Save any open documents without prompting, then exit Word.
          object saveChanges = Word.WdSaveOptions.wdSaveChanges;
          app.Quit(ref saveChanges, ref unknow, ref unknow);
      }
      catch (Exception)
      {
          // Ignore errors on exit (for example, Word was already closed).
      }
  }

  public void replaceChar()
  {
      try
      {
          object replaceAll = Word.WdReplace.wdReplaceAll;
          object missing = Type.Missing;

          // Find every manual line break (^l) and replace it with a paragraph mark (^p).
          app.Selection.Find.ClearFormatting();
          app.Selection.Find.Text = "^l";
          app.Selection.Find.Replacement.ClearFormatting();
          app.Selection.Find.Replacement.Text = "^p";

          // The 11th argument (Replace) selects "replace all".
          app.Selection.Find.Execute(
              ref missing, ref missing, ref missing, ref missing, ref missing,
              ref missing, ref missing, ref missing, ref missing, ref missing,
              ref replaceAll, ref missing, ref missing, ref missing, ref missing);
      }
      catch (Exception)
      {
          MessageBox.Show("An error occurred in the document. Please try again.");
      }
  }
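
  The same cleanup can also be done on text that has already been extracted, rather than inside the document. The sketch below is not from the original article: it relies on the fact that Word returns a manual line break as '\v' (character 11) and a paragraph mark as '\r' (character 13) in Range.Text, and the names WordText, NormalizeLineBreaks and SplitParagraphs are hypothetical helpers introduced only for illustration.

```csharp
using System;

static class WordText
{
    // In text returned by Word, '\v' (char 11) is a manual line break
    // and '\r' (char 13) is a paragraph mark.
    public const char ManualLineBreak = '\v';
    public const char ParagraphMark = '\r';

    // Turn every manual line break into a paragraph mark.
    public static string NormalizeLineBreaks(string text)
    {
        if (text == null) throw new ArgumentNullException("text");
        return text.Replace(ManualLineBreak, ParagraphMark);
    }

    // Split extracted text into non-empty paragraphs after normalizing.
    public static string[] SplitParagraphs(string text)
    {
        return NormalizeLineBreaks(text)
            .Split(new[] { ParagraphMark }, StringSplitOptions.RemoveEmptyEntries);
    }
}

static class Program
{
    static void Main()
    {
        // Hypothetical sample: two entries joined by a manual line break, then a third.
        string raw = "Chapter 1\vChapter 2\rChapter 3\r";
        foreach (string paragraph in WordText.SplitParagraphs(raw))
        {
            Console.WriteLine(paragraph);
        }
    }
}
```

  Unlike replaceChar, this leaves the Word document itself untouched, which may be preferable when the file is opened only for reading.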

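  6: The example above reads one paragraph at a time. To read a single sentence or the whole document instead, replace doc.Paragraphs[i].Range.Text (in readPar) with doc.Sentences[i].Text or doc.Content.Text; Sentences[i] and Content are already Range objects, so no extra .Range is needed. Because these are all Microsoft products, they work together without any friction, and VS2005 is quite smart by now, so I switched from Java to C#.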
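
  As a rough sketch of that variation, the console program below prints every sentence and then the full text of a document. It is an illustration under assumptions, not the author's code: it needs Word installed and a reference to the Word interop assembly, the class name ReadWholeDocument is hypothetical, and the document path is assumed to arrive as the first command-line argument.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class ReadWholeDocument
{
    static void Main(string[] args)
    {
        object path = args[0];          // assumption: full path to a Word document
        object missing = Type.Missing;
        object readOnly = true;         // open read-only, since we only extract text
        object noSave = Word.WdSaveOptions.wdDoNotSaveChanges;

        Word.Application app = new Word.Application();
        try
        {
            // ReadOnly is the third parameter of Documents.Open.
            Word.Document doc = app.Documents.Open(ref path,
                ref missing, ref readOnly, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing);

            // Word collections are 1-based.
            for (int i = 1; i <= doc.Sentences.Count; i++)
            {
                Console.WriteLine(doc.Sentences[i].Text.Trim());
            }

            // The whole document as one string.
            Console.WriteLine(doc.Content.Text);

            doc.Close(ref noSave, ref missing, ref missing);
        }
        finally
        {
            // Always shut Word down, even if reading failed.
            app.Quit(ref noSave, ref missing, ref missing);
        }
    }
}
```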

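  7: In practice, reading Word from C# does not have to be this involved. But since I may also need to extract text from txt, ppt and other formats, I wrote an abstract class, which also makes the calls convenient. That is why the methods in my program are declared with override: designing for generality costs some extra code.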
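
  The article does not show the abstract class itself, so the following is only a guess at its shape. The method names are taken from the code above; the class names DocReader and TxtReader, and the plain-text implementation that treats each non-empty line as a paragraph, are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical base class: one set of operations for every document format.
abstract class DocReader
{
    public abstract void openFile(object fileName);
    public abstract object readPar(int i);
    public abstract int getParCount();
    public abstract void closeFile();
    public abstract void quit();
}

// Hypothetical plain-text implementation: each non-empty line is one paragraph.
class TxtReader : DocReader
{
    private readonly List<string> pars = new List<string>();

    public override void openFile(object fileName)
    {
        pars.Clear();
        foreach (string line in File.ReadAllLines((string)fileName))
        {
            string trimmed = line.Trim();
            if (trimmed.Length > 0) pars.Add(trimmed);
        }
    }

    // 1-based index, to match the Word version of readPar.
    public override object readPar(int i) { return pars[i - 1]; }

    public override int getParCount() { return pars.Count; }

    public override void closeFile() { pars.Clear(); }

    // Nothing to shut down for plain text files.
    public override void quit() { }
}

static class Program
{
    static void Main()
    {
        // Write a small temporary file so the example is self-contained.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "first paragraph", "", "second paragraph" });

        DocReader reader = new TxtReader();
        reader.openFile(path);
        for (int i = 1; i <= reader.getParCount(); i++)
        {
            Console.WriteLine(reader.readPar(i));
        }
        reader.closeFile();
        reader.quit();
        File.Delete(path);
    }
}
```

  Calling code depends only on DocReader, so a Word-backed subclass containing the methods from step 5 could be swapped in without changing the loop.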

