📄 bookschinaparser.java

📁 This system searches five websites for books and merges the results
		String bookName ="";
		/* Take the <a> node. */
		if (bookElement.hasChildNodes()
				&& bookElement.getFirstChild().getNodeType() == Node.ELEMENT_NODE) {
			Element firstElement = (Element) bookElement.getFirstChild();
			NodeList nameList = firstElement.getChildNodes();
			/* Walk the children of <a> and collect their text values. */
			for (int i = 0; i < nameList.getLength(); i++) {
				Node nameNode = nameList.item(i);
				/* Element nodes and #text nodes are handled separately. */
				if (nameNode.getNodeType() == Node.ELEMENT_NODE) {
					/* Element node: take the text of its first child. */
					Element nameElement = (Element) nameNode;
					if (nameElement.hasChildNodes())
						bookName += nameElement.getFirstChild().getNodeValue();
				} else {
					/* #text node: take its value directly. */
					bookName += nameNode.getNodeValue();
				}
			}
		}
		/* Strip spaces from the title. */
		bookName = bookName.replace(" ", "");
//		System.out.println(bookName);
		return bookName;
	}

	/**
	 * Function: getBookAuthor 
	 * Description: Gets the book's author
	 * Calls: no 
	 * Called By:mainService
	 * @param bookElement as Element
	 * @return String
	 * @throws no
	 */
	public String getBookAuthor(Element bookElement) {

		String bookAuthor = "";
		NodeList authorList = bookElement.getChildNodes();
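		for (int i = 0; i < authorList.getLength(); i++) {
			Node nameNode = authorList.item(i);
			String text;
			if (nameNode.getNodeType() == Node.ELEMENT_NODE
					&& "A".equals(nameNode.getNodeName())) {
				/* <A> element: take the text of its first child. */
				text = nameNode.hasChildNodes()
						? nameNode.getFirstChild().getNodeValue() : null;
			} else {
				/* Any other node: take its own value (null for elements). */
				text = nameNode.getNodeValue();
			}
			/* Skip nulls so the literal "null" is never appended. */
			if (text != null)
				bookAuthor += text;
		}
		/* Drop everything up to and including the "作者:" (author) label. */
		if (bookAuthor.indexOf("作者:") != -1)
			bookAuthor = bookAuthor.substring(bookAuthor.indexOf("作者:") + "作者:".length());
		bookAuthor = bookAuthor.replace(" ", "").trim();

		/* Cap the author field at 64 characters. */
		if (bookAuthor.length() > 64)
			bookAuthor = bookAuthor.substring(0, 64);
		/* Turn ASCII and full-width commas into spaces; drop "等" (et al.). */
		bookAuthor = bookAuthor.replace(",", " ");
		bookAuthor = bookAuthor.replace("\uFF0C", " ");
		bookAuthor = bookAuthor.replace("等", "");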
		//System.out.println("bookAuthor:"+bookAuthor);
		return bookAuthor;
	}

	/**
	 * Function: getBookPublisher 
	 * Description: Gets the book's publisher
	 * Calls: no 
	 * Called By:mainService
	 * @param bookElement as Element
	 * @return String
	 * @throws no
	 */
	public String getBookPublisher(Element bookElement) {

		/* Start from "" so an empty element cannot cause a NullPointerException below. */
		String bookPublisher = "";
		String text = bookElement.getTextContent();
		if (text != null && !"".equals(text))
			bookPublisher = text.trim();
		/* Remove the "出版社:" (publisher) label. */
		bookPublisher = bookPublisher.replace("出版社:", "");
		//System.out.println(bookPublisher);
		return bookPublisher;
	}

	/**
	 * Function: getBookPublishTime 
	 * Description: Gets the book's publication date
	 * Calls: no 
	 * Called By:mainService
	 * @param bookElement as Element
	 * @return String
	 * @throws no
	 */
	public String getBookPublishTime(Element bookElement) {
		String bookPublishTime = null;
		if (bookElement.hasChildNodes())
			bookPublishTime = bookElement.getFirstChild().getNodeValue();
		/* Every step is guarded: the first child may be missing or have no value. */
		if (null != bookPublishTime) {
			bookPublishTime = bookPublishTime.trim();
			if (bookPublishTime.length() >= 15)
				bookPublishTime = bookPublishTime.substring(0, 15);
			/* Drop the "出版日期:" (publication date) label. */
			if (bookPublishTime.indexOf("出版日期:") != -1) {
				bookPublishTime = bookPublishTime.substring(bookPublishTime.indexOf("出版日期:") + "出版日期:".length());
			}
			/* Cut off anything from the "ISBN:" label onward. */
			if (bookPublishTime.indexOf("ISBN:") != -1) {
				bookPublishTime = bookPublishTime.substring(0, bookPublishTime.indexOf("ISBN:"));
			}
			bookPublishTime = bookPublishTime.trim();
		}
	
		//System.out.println(bookPublishTime);
		return bookPublishTime;
	}

	/**
	 * Function: getBookISBN 
	 * Description: Gets the book's ISBN
	 * Calls: no 
	 * Called By: mainService
	 * @param bookElement as Element
	 * @return String
	 * @throws no
	 */
	public String getBookISBN(Element bookElement) {
		
		String bookISBN = null;
		if (bookElement.hasChildNodes())
			bookISBN = bookElement.getFirstChild().getNodeValue();
		/* Keep only the text after the "ISBN:" label, when one is present. */
		if (bookISBN != null) {
			int labelPos = bookISBN.indexOf("ISBN:");
			if (labelPos != -1 && bookISBN.length() > labelPos + 5)
				bookISBN = bookISBN.substring(labelPos + 5);
		}

		//System.out.println(bookISBN);
		return bookISBN;
	}

	/**
	 * Function: getBookPrice 
	 * Description: Gets the book's price on this site
	 * Calls: no 
	 * Called By:mainService
	 * @param bookElement as Element
	 * @return String
	 * @throws no
	 */
	public String getBookPrice(Element bookElement) {

		String bookPrice = "";

		NodeList childList = bookElement.getChildNodes();
		for (int i = 0; i < childList.getLength(); i++) {
			Node childNode = childList.item(i);
			/* The price is the text of the first <SPAN> child. */
			if (childNode.getNodeType() == Node.ELEMENT_NODE
					&& "SPAN".equals(childNode.getNodeName())) {
				if (childNode.hasChildNodes()) {
					String text = childNode.getFirstChild().getNodeValue();
					if (text != null)
						bookPrice = text;
				}
				break;
			}
		}
		/* Strip the currency sign and thousands separators (ASCII and full-width). */
		bookPrice = bookPrice.replace("¥", "");
		bookPrice = bookPrice.replace(",", "");
		bookPrice = bookPrice.replace("\uFF0C", "");
//		if(bookPrice.length()>3)
//			bookPrice = bookPrice.substring(2);
		bookPrice = bookPrice.trim();
		//System.out.println(bookPrice);
		return bookPrice;
	}

	/**
	 * Function: getBookDiscount 
	 * Description: Gets the book's discount
	 * Calls: no 
	 * Called By:mainService 
	 * @param bookElement as Element
	 * @return String
	 * @throws no
	 */
	public String getBookDiscount(Element bookElement) {
		
		return null;
	}

	/**
	 * Function: getBookFixPrice 
	 * Description: Gets the book's list price
	 * Calls: no 
	 * Called By:mainService
	 * @param bookElement as Element
	 * @return String
	 * @throws no
	 */
	public String getBookFixPrice(Element bookElement) {
		String bookFixPrice = "";
		
		NodeList childList = bookElement.getChildNodes();
		for (int i = 0; i < childList.getLength(); i++) {
			Node childNode = childList.item(i);
			if (childNode.getNodeType() == Node.ELEMENT_NODE) {
				Element childElement = (Element) childNode;
				/* The list price is the text of the child whose class is "price". */
				if ("price".equals(childElement.getAttribute("class"))
						&& childElement.hasChildNodes()) {
					String text = childElement.getFirstChild().getNodeValue();
					if (text != null)
						bookFixPrice = text;
				}
			}
		}
		/* Strip the currency sign, thousands separators and spaces. */
		bookFixPrice = bookFixPrice.replace("¥", "");
		bookFixPrice = bookFixPrice.replace(",", "");
		bookFixPrice = bookFixPrice.replace("\uFF0C", "");
		bookFixPrice = bookFixPrice.replace(" ", "");
//		System.out.println(bookFixPrice);
		return bookFixPrice;
	}

	/**
	 * Function: getBookUrl 
	 * Description: Gets the URL of the book's detail page
	 * Calls: no 
	 * Called By:mainService
	 * @param bookElement as Element
	 * @return String
	 * @throws no
	 */
	public String getBookUrl(Element bookElement) {
		String bookUrl = "";
		if (bookElement.hasChildNodes()
				&& bookElement.getFirstChild().getNodeType() == Node.ELEMENT_NODE) {
			Element firstElement = (Element) bookElement.getFirstChild();
			/* getAttribute returns "" (never null) when href is absent. */
			String href = firstElement.getAttribute("href").trim();
			if (!"".equals(href))
				bookUrl = "http://www.bookschina.com" + href;
		}
		//System.out.println(bookUrl);
		return bookUrl;
	}
	public String getBookContent(Element bookElement) {
		// TODO Auto-generated method stub
		return null;
	}
	/**
	 * Function: getNextPageUrl 
	 * Description: Gets the URL of the next results page
	 * Calls: no 
	 * Called By: no
	 * @param doc as Document
	 * @return String
	 * @throws no
	 */
	public String getNextPageUrl(Document doc) {
		/* Defaults to "no", meaning there is no next page. */
		String nextpageUrl = "no";

		NodeList divList = doc.getElementsByTagName("div");
		for (int i = 0; i < divList.getLength(); i++) {
			if(divList.item(i).getNodeType() == Node.ELEMENT_NODE){
				Element serveritem = (Element) divList.item(i);
				/* Pick out the pagination toolbar <div> by its inline style. */
				if ("float: right;margin:5px 5px 0px 0px;".equals(serveritem
						.getAttribute("style"))) {
					NodeList chList = serveritem.getChildNodes();
					/* Use the fifth child as the next-page link, but only if it is an element. */
					if (chList.getLength() >= 5
							&& chList.item(4).getNodeType() == Node.ELEMENT_NODE) {
						Element sElement = (Element) chList.item(4);
						/* getAttribute returns "" when href is absent. */
						if (!"".equals(sElement.getAttribute("href"))) {
							nextpageUrl = "http://www.bookschina.com"
									+ sElement.getAttribute("href").trim();
							break;
						}
					}
				}
			}
		}
		//System.out.println(nextpageUrl);
		return nextpageUrl;
	}

	public long getRecordNum(Document doc) {

		/* Defaults to 0. */
		long num = 0;
		NodeList servers = doc.getElementsByTagName("div");
		for (int i = 0; i < servers.getLength(); i++) {
            if(servers.item(i).getNodeType()==Node.ELEMENT_NODE){
				Element serveritem = (Element) servers.item(i);
				if ("keyword".equals(serveritem.getAttribute("class"))) {
	
					NodeList childList1 = serveritem.getChildNodes();
					for(int j = 0;j<childList1.getLength();j++){
						Node spanNode = childList1.item(j);
						if(spanNode.getNodeType()== Node.ELEMENT_NODE){
							Element spanElement = (Element) spanNode;
							if ("topic2".equals(spanElement.getAttribute("class"))) {
								/* The record count is the text of the first child, if any. */
								Node countNode = spanElement.getFirstChild();
								if (countNode != null && countNode.getNodeValue() != null)
									num = Long.parseLong(countNode.getNodeValue().trim());
								break;
							}
						}
					}
				}
            }
		}
		//System.out.println(num);
		return num;
	}

	public static void main(String args[]) throws Exception {
		//System.out.println("作 者:".length());
		BookschinaParser bookChina = new BookschinaParser();
		Document doc = bookChina
				.nekohtmlParser("http://www.bookschina.com/book_find/goodsfind.aspx?book=java");
		//Price price = bookChina.getDetailInfo(doc);
		//System.out.println(price.getBookschinaDiscount() + ">>" + price.getBookschinaPrice() + price.getBookschinaUrl());
		ArrayList<Book> list=bookChina.mainService(doc,true);
		/* Print each book, then its price details for this site. */
		for (Book book1 : list) {
			System.out.println(book1.getBookName() + ">>" + book1.getBookAuthor() + ">>" + book1.getBookImage() + ">>" + book1.getBookFixPrice() + ">>" + book1.getBookISBN() + ">>" + book1.getBookPublishTime() + ">>" + book1.getBookPublisher());
			Price price = book1.getPrice();
			System.out.println(price.getBookschinaDiscount() + ">>" + price.getBookschinaPrice() + ">>" + price.getBookschinaUrl());
		}
		System.out.println(bookChina.getNextPageUrl(doc));
		// System.out.println(dangDang.getNextPageUrl(doc));
		System.out.println(bookChina.getRecordNum(doc));
	}



}
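
The getters in this class repeat the same two steps: read the value of an element's first child, then cut away a label such as "ISBN:". As a minimal sketch only, and not part of the original class, those steps could be pulled into null-safe helpers. The names `FieldText`, `firstChildText` and `afterLabel` are hypothetical, and the sample value in the usage note is made up.

```java
import org.w3c.dom.Node;

// Hypothetical helper, not part of the original BookschinaParser.
final class FieldText {
    private FieldText() {}

    /** Returns the value of the node's first child, or "" when there is none. */
    static String firstChildText(Node node) {
        if (node == null || node.getFirstChild() == null) {
            return "";
        }
        String value = node.getFirstChild().getNodeValue();
        return value == null ? "" : value;
    }

    /**
     * Returns the trimmed text that follows the label. If the label is absent,
     * the trimmed input is returned; a null input yields "".
     */
    static String afterLabel(String text, String label) {
        if (text == null) {
            return "";
        }
        int pos = text.indexOf(label);
        String rest = (pos == -1) ? text : text.substring(pos + label.length());
        return rest.trim();
    }
}
```

Under these assumptions, a call such as `FieldText.afterLabel(FieldText.firstChildText(bookElement), "ISBN:")` would yield an empty string rather than throw when the element has no children.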
