
📄 sina_weather_txt.py

📁 Python web crawler / scraping
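# -*- coding:cp936 -*-
# Look up a city on weather.sina.com.cn and print its forecast (Python 2).
import urllib, urllib2, re, os

cityname = raw_input("input the city name:").strip()
url = "http://php.weather.sina.com.cn/search.php?city="
# Percent-encode the city name so non-ASCII bytes are safe inside the URL.
sock = urllib2.urlopen(url + urllib.quote(cityname))
htmlone = sock.read()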
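# The search result page links to the city's forecast page; capture that href.
htmlre = re.compile(r"""<!-- main begin -->\s*<div\s*class="main">\s*<div\s*class="text">\s*<h3>.*</h3>\s*<p>.*</p>\s*<p>.*<span><a\s*href="([^"]*)">.*</a></span></p>""")
htmlrs = htmlre.findall(htmlone)
if not htmlrs:
    # No forecast link on the page: the city was not found or the layout changed.
    raise SystemExit("no forecast page found for: " + cityname)
sock2 = urllib2.urlopen(htmlrs[0])
htmlsource = sock2.read()
sock2.close()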
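#print htmlsource
# Normalize the markup before matching: decode entities, shorten weekday names
# ("星期" -> "周"), drop the year from dates, and strip <span> tags.
htmlsource = htmlsource.replace("&nbsp;"," ")
htmlsource = htmlsource.replace("星期","周")
htmlsource = re.sub(r"\d{4}-(\d{2}-\d{2})", r"\1", htmlsource)  # YYYY-MM-DD -> MM-DD
htmlsource = htmlsource.replace("<span>","")
htmlsource = htmlsource.replace("</span>","")
htmlsource = htmlsource.replace("&deg;C","℃")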
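# One match per forecast block, capturing four fields: the City_Data <h3> and <p>
# text, then the Weather_TP and Weather_W text. re.X ignores the layout whitespace.
p = re.compile(r"""<div\s*class="City_Data">\s*<h3>(.*)</h3>\s*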
                   <p>(.*)</p>\s*</div>\s*<div\s*class="Weather_Icon_B"><img.*>\s*</div>\s*
                   <div\s*class="Weather_TP">(.*)</div>\s*<div\s*class="Weather_W">(.*)</div>\s*</div>\s*
                    """,re.I|re.X)
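# Matches a full "YYYY-MM-DD HH:MM:SS" publish timestamp. Currently unused; because
# the year is stripped from htmlsource above, it would have to run before that step.
q = re.compile(r"""(\d{4}-\d{2}-\d{2}\s*\d{2}:\d{2}:\d{2})""")

rs = p.findall(htmlsource)
#rt = q.findall(htmlsource)
#print """Published:""", rt[0]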
# Print the four captured fields for each forecast block found (at most three).
for day in rs[:3]:
    print day[0], day[1], day[2], day[3]
sock.close()
os.system("pause")  # Windows only: keeps the console window open
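
The forecast pattern can be exercised offline, without touching the network. The sketch below reuses the script's pattern unchanged and runs it against a small hand-written snippet. `SAMPLE_HTML` and `parse_forecast` are hypothetical names introduced here, and the markup is invented for illustration, so the real page may well differ. The block avoids `print` and `urllib2`, so it should run under both Python 2 and Python 3.

```python
import re

# Same pattern as in the script: four captured fields per forecast block.
FORECAST_RE = re.compile(r"""<div\s*class="City_Data">\s*<h3>(.*)</h3>\s*
                   <p>(.*)</p>\s*</div>\s*<div\s*class="Weather_Icon_B"><img.*>\s*</div>\s*
                   <div\s*class="Weather_TP">(.*)</div>\s*<div\s*class="Weather_W">(.*)</div>\s*</div>\s*
                    """, re.I | re.X)

# Hypothetical markup, invented for illustration; not copied from the real site.
SAMPLE_HTML = """
<div class="City_Data">
<h3>Beijing</h3>
<p>06-01 Mon</p>
</div>
<div class="Weather_Icon_B"><img src="sunny.png">
</div>
<div class="Weather_TP">30C / 18C</div>
<div class="Weather_W">North wind, light</div>
</div>
"""


def parse_forecast(html):
    """Return a list of 4-tuples, one per forecast block matched in html."""
    return FORECAST_RE.findall(html)


# Each tuple holds the <h3> text, the <p> text, the temperature and the wind.
forecast = parse_forecast(SAMPLE_HTML)
```

Keeping the matching step in a function of its own makes it easier to notice when a layout change on the site stops the pattern from matching: an empty list signals that, where the script's indexing would fail.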
