What is Term Frequency (TF)? 

A Term Frequency (TF) matrix has one row per document in the corpus and one column per word in the vocabulary. The entry in row d and column w is the number of times word w occurs in document d; a value of 0 means the word does not appear in that document. Because a large corpus has a large vocabulary and any single document uses only a small part of it, most entries are 0, so the matrix is typically large and sparse.
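As a rough illustration of the layout described above, here is a minimal Python sketch that builds a TF matrix from a tiny made-up corpus. The function name `build_tf_matrix`, the example sentences, and the tokenization (lowercasing and splitting on whitespace) are assumptions chosen for the example, not a standard method; real pipelines usually handle punctuation and store the result in a sparse format.

```python
from collections import Counter


def build_tf_matrix(documents):
    """Return (vocabulary, matrix), where matrix[d][j] is the number of
    times vocabulary[j] occurs in documents[d]."""
    # Assumption for this sketch: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    # Sort the vocabulary so the column order is stable.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for words that are absent from the document.
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical three-document corpus.
corpus = ["the cat sat", "the dog sat on the mat", "cats and dogs"]
vocabulary, tf = build_tf_matrix(corpus)

print(vocabulary)
for doc_id, row in enumerate(tf):
    print(doc_id, row)
```

Each printed row corresponds to one document and each position in the row to one vocabulary word. Even in this small corpus most entries are 0, which is why larger TF matrices are normally kept in a sparse representation rather than as nested lists.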
