Purpose: personal study notes. Take from them what is useful; corrections are welcome.
Preface: These notes mainly cover the requests library and web-scraping parsing libraries, Python crawler automation, and bulk information gathering. Python development tool: PyCharm 2022.1
I. Scraping the names of educational institutions that accept vulnerability reports from the EDUSRC platform
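1. Scraping target: the names of all educational institutions listed on pages 1-209 of the EDUSRC ranking page, saved to a txt file.
2. Python implementation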
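import requests, time
from bs4 import BeautifulSoup

# Each table row on the ranking page looks like this:
# <tr>
#   <td class="am-text-center">1</td>
#   <td class="am-text-center">
#     <a href="/list/firm/3761">上海交通大学</a>
#   </td>
#   <td class="am-text-center">3994</td>
#   <td class="am-text-center">10523</td>
# </tr>

def get_eduName():
    # range() excludes its upper bound, so 210 is needed to reach page 209
    for i in range(1, 210):
        url = 'https://src.sjtu.edu.cn/rank/firm/0/?page=%s' % str(i)
        try:
            s = requests.get(url).text
            print('-------Fetching data from page %s' % str(i))
            soup = BeautifulSoup(s, 'lxml')
            edu1 = soup.find_all('tr')
            for edu in edu1:
                # Skip rows without a link (for example the table header),
                # otherwise one such row would abort the whole page
                if edu.a is None or edu.a.string is None:
                    continue
                edu_name = edu.a.string
                print(edu_name)
                # The with-statement closes the file; no explicit close() needed
                with open('eduname.txt', 'a', encoding='utf-8') as f:
                    f.write(edu_name + '\n')
        except Exception as e:
            # Pause briefly after a failed page, then continue with the next one
            time.sleep(1)

if __name__ == '__main__':
    get_eduName()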
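The row-parsing step above can be checked offline against the sample row quoted in the code comments, without sending any requests. The following is a minimal sketch that uses the standard library's html.parser rather than BeautifulSoup; the names FirmNameParser and extract_firm_names are hypothetical, and the assumption that institution links start with /list/firm/ is inferred from the single sample row, not confirmed against the live site.

```python
from html.parser import HTMLParser

# Sample row copied from the comments in the scraper above
SAMPLE_ROW = """
<tr>
  <td class="am-text-center">1</td>
  <td class="am-text-center">
    <a href="/list/firm/3761">上海交通大学</a>
  </td>
  <td class="am-text-center">3994</td>
  <td class="am-text-center">10523</td>
</tr>
"""


class FirmNameParser(HTMLParser):
    """Collects the text of <a> tags whose href starts with /list/firm/ (assumed pattern)."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_firm_link = False

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if tag == "a" and href.startswith("/list/firm/"):
            self._in_firm_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_firm_link = False

    def handle_data(self, data):
        # Only keep non-empty text that sits inside an institution link
        if self._in_firm_link and data.strip():
            self.names.append(data.strip())


def extract_firm_names(html):
    parser = FirmNameParser()
    parser.feed(html)
    return parser.names


print(extract_firm_names(SAMPLE_ROW))
```

Rows that contain no institution link, such as a header row, simply contribute nothing to the result, which mirrors the None check in the scraper.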
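II. Using the FOFA search engine to bulk-scrape URLs related to the target
1. Use FOFA search syntax to collect every URL related to the target's name.
2. Python implementation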
import requests
from bs4 import BeautifulSoup

header = {
    # Log in to FOFA, then copy the fofa_token login credential from a request in the browser
    'cookie': 'fofa_token=<your fofa_token>;'
}

# qbase64 is the Base64 form of: title="上海交通大学" && country="CN"
base_url = 'https://fofa.info/result?qbase64=dGl0bGU9IuS4iua1t%2BS6pOmAmuWkp%2BWtpiIgJiYgY291bnRyeT0iQ04i'
s = requests.get(base_url, headers=header).text
soup = BeautifulSoup(s, 'lxml')

# Get the number of result pages (10 results per page, rounded up)
yeshu = 0
edu1 = soup.find_all('p', attrs={'class': 'hsxa-nav-font-size'})
for edu in edu1:
    edu_name = edu.span.get_text()
    yeshu = (int(edu_name) + 9) // 10
    print(yeshu)

for ye in range(1, yeshu + 1):
    url = base_url + '&page=' + str(ye) + '&page_size=10'
    print(url)
    s = requests.get(url, headers=header).text
    # Re-parse each page; reusing the first page's soup would repeat its results
    soup = BeautifulSoup(s, 'lxml')
    edu1 = soup.find_all('span', attrs={'class': 'hsxa-host'})
    for edu in edu1:
        edu_name = edu.a.get_text().strip()
        print(edu_name)

3. Create a new scan task in Goby, import the collected target URLs, and scan them for vulnerabilities in bulk.
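The query encoding and page arithmetic used in step 2 of this section can be sketched on their own, with no network access. This is an illustrative sketch only: the helper names encode_query, page_count, and build_page_urls are hypothetical, and the result total of 25 is a made-up figure used to show the paging. The URL layout (qbase64, page, page_size) is taken from the scraper above.

```python
import base64
from urllib.parse import quote


def encode_query(query):
    """Base64-encode a FOFA search expression."""
    return base64.b64encode(query.encode("utf-8")).decode("ascii")


def page_count(total_results, page_size=10):
    """Ceiling division: number of pages needed to show every result."""
    return (total_results + page_size - 1) // page_size


def build_page_urls(query, total_results, page_size=10):
    # Percent-encode the Base64 text so that '+', '/' and '=' survive in a URL
    qbase64 = quote(encode_query(query), safe="")
    base = "https://fofa.info/result?qbase64=" + qbase64
    return [
        f"{base}&page={page}&page_size={page_size}"
        for page in range(1, page_count(total_results, page_size) + 1)
    ]


# 25 is a hypothetical result total, chosen only to demonstrate paging
urls = build_page_urls('title="上海交通大学" && country="CN"', 25)
for u in urls:
    print(u)
```

Ceiling division avoids requesting an empty extra page when the result total is an exact multiple of the page size.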
III. Using the FOFA query API to bulk-query and collect URLs
Python implementation
import requests
import base64

# https://fofa.info/api/v1/search/all?email=your_email&key=your_key&qbase64=dGl0bGU9ImJpbmci
def get_fofa_data(email, apikey):
    # eduname.txt is the file of institution names produced in section I
    for eduname in open('eduname.txt', encoding='utf-8'):
        e = eduname.strip()
        search = '"%s" && country="CN" && title="Error 404--Not Found"' % e
        b = base64.b64encode(search.encode('utf-8'))
        b = b.decode('utf-8')
        # Pass the values as params so that requests URL-encodes the Base64 text
        params = {'email': email, 'key': apikey, 'qbase64': b}
        r = requests.get('https://fofa.info/api/v1/search/all', params=params)
        s = r.json()
        print('Querying -> ' + e)
        print(r.url)
        if s['size'] != 0:
            print(e + ' has data')
            for ip in s['results']:
                print(ip[0])
        else:
            print('No data')

if __name__ == '__main__':
    email = '<your_email>'    # your own FOFA account
    apikey = '<your_key>'     # the API key from your FOFA personal center
    get_fofa_data(email, apikey)
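The response handling above relies on only two fields, size and results, where each result row is a list whose first element is the host. The sketch below isolates the URL construction and that extraction step so they can be tried offline. The helper names build_api_url and extract_hosts are hypothetical, and the sample response is invented for illustration; its shape is an assumption based on how the code above reads the JSON, not on the FOFA documentation.

```python
import base64
from urllib.parse import urlencode

API_ENDPOINT = "https://fofa.info/api/v1/search/all"


def build_api_url(email, apikey, query):
    """Build the search URL; urlencode escapes any '+', '/' or '=' in the Base64 text."""
    qbase64 = base64.b64encode(query.encode("utf-8")).decode("ascii")
    return API_ENDPOINT + "?" + urlencode(
        {"email": email, "key": apikey, "qbase64": qbase64}
    )


def extract_hosts(response):
    """Return the first column of every result row, or [] when size is 0."""
    if response.get("size", 0) == 0:
        return []
    return [row[0] for row in response.get("results", [])]


# Placeholder credentials and the query from the comment in the code above
url = build_api_url("your_email", "your_key", 'title="bing"')
print(url)

# Invented sample response (assumed shape, not real data)
sample_response = {
    "size": 2,
    "results": [
        ["a.example.edu.cn", "192.0.2.10", "80"],
        ["b.example.edu.cn", "192.0.2.11", "443"],
    ],
}
print(extract_hosts(sample_response))
```

Keeping the extraction in its own function makes it straightforward to write the hosts to a file for a later Goby import instead of printing them.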