[Web Crawler] Collecting search results and saving them to a txt file

Extracting web page data

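# Load the required modules and libraries, then read the search keyword
# (the Selenium calls below use the Selenium 4 API)
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
import sys

query_txt = input('Enter the keyword to crawl: ')
f_name = input('Enter the path and file name for saving the results (e.g. /Users/test.txt): ')

# Launch the web browser through ChromeDriver
path = "/Users/chromedriver"
driver = webdriver.Chrome(service=Service(path))
driver.get("https://korean.visitkorea.or.kr/main/main.html")
time.sleep(2) # wait 2 seconds for the page to finish loading

# Open the search box, locate it by id, and type the keyword
driver.find_element(By.ID, "btnSearch").click()

element = driver.find_element(By.ID, "inp_search")
element.send_keys(query_txt)

# Click the search button to run the search
driver.find_element(By.LINK_TEXT, "검색").click()

# Print the contents of the current page to the screen
time.sleep(1)

html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
blog_list = soup.find('ul', class_='list_thumType flnon')

# Iterate over the <li> entries only; looping over the <ul> itself also yields whitespace text nodes
for i in blog_list.find_all('li'):
    print(i.text.strip())
    print("\n")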
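The list extraction step can be tried without launching a browser. The following is a minimal sketch, assuming the results page marks its list with the `list_thumType flnon` class used above. The sample HTML and the helper name `extract_items` are made up for illustration and are not taken from the site.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for driver.page_source
SAMPLE_HTML = """
<ul class="list_thumType flnon">
  <li><a href="#">Item one</a></li>
  <li><a href="#">Item two</a></li>
</ul>
"""

def extract_items(html, class_name="list_thumType flnon"):
    """Return the stripped text of each <li> in the matching <ul>, or [] if absent."""
    soup = BeautifulSoup(html, "html.parser")
    result_list = soup.find("ul", class_=class_name)
    if result_list is None:  # soup.find returns None when nothing matches
        return []
    return [li.get_text(strip=True) for li in result_list.find_all("li")]

if __name__ == "__main__":
    for text in extract_items(SAMPLE_HTML):
        print(text)
```

Returning an empty list when the `<ul>` is missing avoids the error that looping over `None` would otherwise raise, for example when the page has not finished loading or the class name has changed.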

Saving to a text file

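# Save the contents of the current page to a file in txt format
# by temporarily redirecting standard output to the file
orig_stdout = sys.stdout
f = open(f_name, 'a', encoding='UTF-8')
sys.stdout = f
time.sleep(1)

html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
blog_list = soup.find('ul', class_='list_thumType flnon')

# Iterate over the <li> entries only; looping over the <ul> itself also yields whitespace text nodes
for i in blog_list.find_all('li'):
    print(i.text.strip())
    print("\n")

# Restore standard output and close the file
sys.stdout = orig_stdout
f.close()

print("Data collection complete")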
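Redirecting `sys.stdout` works, but if an error occurs before it is restored, later `print` calls keep going to the file. As an alternative (not the approach used in this post), the sketch below writes through `print(..., file=f)` inside a `with` block. The helper name `save_lines` and the demo path are hypothetical.

```python
import os
import tempfile

def save_lines(lines, file_path):
    """Append each non-empty line to file_path as UTF-8 text and return the count written."""
    written = 0
    # The with-block closes the file even if an exception is raised midway
    with open(file_path, "a", encoding="UTF-8") as f:
        for line in lines:
            text = line.strip()
            if text:
                print(text, file=f)  # writes to the file; sys.stdout is left untouched
                written += 1
    return written

if __name__ == "__main__":
    # Hypothetical output location used only for this demonstration
    demo_path = os.path.join(tempfile.gettempdir(), "crawl_demo.txt")
    count = save_lines(["  first result  ", "", "second result"], demo_path)
    print(count, "lines appended to", demo_path)
```

In the crawler, the texts collected from the result list could be passed as `lines` and `f_name` as `file_path`.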

GOATLAB
