afinn1

[Data Science] Matrix Representations of Documents (DTM and TF-IDF)

Tokenization with CountVectorizer

    import sklearn
    print(sklearn.__version__)

    from sklearn.feature_extraction.text import CountVectorizer

    vector = CountVectorizer()
    text = ['Text mining, also referred to as text data mining, similar to text analytics, is the process of deriving high-quality information from text.']
    vector.fit_transform(text).toarray()

    array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 4,..

2022. 9. 29.
