Vector Space Modeling (20%)
Gensim handles large text collections using data streaming and efficient incremental algorithms, which differentiates it from most other scientific software packages, which target only batch and in-memory processing.
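The streaming idea can be illustrated without any library: documents are consumed one at a time and running statistics are updated incrementally, so the collection never has to fit in memory. This is a minimal sketch, not Gensim's implementation; the names `stream_tokens` and `update_doc_freq` are hypothetical, and whitespace tokenization is an assumption.

```python
from collections import Counter


def stream_tokens(lines):
    """Yield one tokenized document at a time from any iterable of lines.

    `lines` can be an open file handle, so only the current line is held
    in memory. Tokenization by whitespace is an assumption of this sketch.
    """
    for line in lines:
        tokens = line.lower().split()
        if tokens:  # skip blank lines
            yield tokens


def update_doc_freq(doc_freq, documents):
    """Incrementally update document frequencies from a stream of documents."""
    for tokens in documents:
        # Count each term once per document.
        doc_freq.update(set(tokens))
    return doc_freq


# The same counter can be updated again when new documents arrive.
doc_freq = update_doc_freq(Counter(), stream_tokens(["a b a", "b c"]))
doc_freq = update_doc_freq(doc_freq, stream_tokens(["c d"]))
```

Because `stream_tokens` accepts any iterable, passing `open(path, encoding="utf-8")` instead of a list streams a file from disk line by line.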
Gensim
REF:
sklearn feature extraction
REF
Gensim FastText pre-trained model: get vectors for out-of-vocabulary words
Out-of-vocabulary words are represented as the sum of character n-gram vectors. While the intent is to handle out-of-vocabulary words (unks) like "blargfizzle", it also handles phrases like your input.
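The composition described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not Gensim's or FastText's actual code: the table of n-gram vectors is random, `zlib.crc32` stands in for the library's own hash function, and `DIM`, `BUCKETS`, `char_ngrams`, and `oov_vector` are hypothetical names. The n-gram range of 3 to 6 and the `<`/`>` boundary markers follow the commonly documented FastText defaults.

```python
import random
import zlib

DIM = 8         # toy vector size (assumption)
BUCKETS = 1000  # toy number of hash buckets (assumption)


def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word` wrapped in '<' and '>' markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(wrapped) - n + 1)
    ]


# Toy n-gram embedding table: one random vector per hash bucket.
_rng = random.Random(0)
NGRAM_TABLE = [[_rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(BUCKETS)]


def oov_vector(word):
    """Sum the bucket vectors of every character n-gram in `word`."""
    vec = [0.0] * DIM
    for gram in char_ngrams(word):
        # crc32 is a stand-in hash, used here only for determinism.
        bucket = zlib.crc32(gram.encode("utf-8")) % BUCKETS
        for k in range(DIM):
            vec[k] += NGRAM_TABLE[bucket][k]
    return vec


unk = oov_vector("blargfizzle")        # an unseen word still gets a vector
phrase = oov_vector("vector space")    # a phrase is just another string
```

Since the function only looks at characters, a multi-word string is handled the same way as a single unknown word, which is why phrases also receive a vector. Real implementations may normalize the sum by the number of n-grams.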
BERT
Text similarity
Getting started with Word2Vec
Word2vec Made Easy
Fully Embracing the Transformer: A Comparison of the Three Major NLP Feature Extractors (CNN/RNN/TF)
https://blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/86446077
five most popular Similarity Measures in Python
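As a quick reminder of what such measures compute, here is a minimal sketch of three widely used ones (cosine similarity, Euclidean distance, Jaccard similarity). Which measures the linked article covers is not stated in these notes, so this selection is an assumption, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def jaccard_similarity(a, b):
    """Size of the intersection over size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen for this sketch: two empty sets are identical
    return len(a & b) / len(a | b)
```

Cosine similarity ignores vector length, which makes it a common choice for comparing word or document vectors; Jaccard works directly on token sets and needs no vectors at all.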