⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 earlytrectopicreader.java

📁 dragontoolkit用于机器学习
💻 JAVA
字号:
package dragon.onlinedb.trec;import dragon.onlinedb.*;import dragon.util.FileUtil;import java.util.ArrayList;/** * <p>TREC topic reader earlier than 2004 </p> * <p></p> * <p>Copyright: Copyright (c) 2005</p> * <p>Company: IST, Drexel University</p> * @author Davis Zhou * @version 1.0 */public class EarlyTrecTopicReader extends AbstractTopicReader{    public EarlyTrecTopicReader(String topicFile){        super(topicFile);    }    protected ArrayList loadTopics(String topicFile){        ArrayList list;        String content;        int start, end;        try {            content=FileUtil.readTextFile(topicFile);            list=new ArrayList();            start=0;            end=content.indexOf("</top>");            while(end>0){                end=end+6;                list.add(parseEarlyTrecTopic(content.substring(start,end)));                start=end;                end=content.indexOf("</top>",start);            }            return list;        }        catch (Exception e) {            e.printStackTrace();            return null;        }    }    private Article parseEarlyTrecTopic(String content){        BasicArticle article;        String section;        int start;        article=new BasicArticle();        section=getTopicSection("num",content);        start=section.indexOf(":")+1;        article.setKey(section.substring(start).trim());        article.setCategory(Integer.parseInt(article.getKey()));        section=getTopicSection("title",content);        if(section!=null){            start = section.indexOf(":") + 1;            article.setTitle(section.substring(start).trim());        }        section=getTopicSection("desc",content);        if(section!=null){            start = section.indexOf(":") + 1;            article.setAbstract(section.substring(start).trim());        }        section=getTopicSection("narr",content);        if(section!=null){            start = section.indexOf(":") + 1;            article.setBody(section.substring(start).trim());        }        section=getTopicSection("con",content);        if(section!=null){            start = section.indexOf(":") + 1;            article.setMeta(section.substring(start).trim());        }        return article;    }    private String getTopicSection(String sectionName, String topic){        int start, end;        start=topic.indexOf("<"+sectionName+">");        if(start<0) return null;        start+=sectionName.length()+2;        end=topic.indexOf("\n<",start);        topic=topic.substring(start,end);        return topic.replace('\n',' ').trim();    }}

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -