as.data.frame.udpipe_connlu | Convert the result of udpipe_annotate to a tidy data frame |
as_phrasemachine | Convert Parts of Speech tags to one-letter tags which can be used to identify phrases based on regular expressions |
brussels_listings | Brussels AirBnB address locations available at www.insideairbnb.com |
brussels_reviews | Reviews of AirBnB customers on Brussels address locations available at www.insideairbnb.com |
brussels_reviews_anno | Reviews of the AirBnB customers which are tokenised, POS tagged and lemmatised |
collocation | Extract collocations - a sequence of terms which follow each other |
cooccurrence | Create a cooccurrence data.frame |
cooccurrence.character | Create a cooccurrence data.frame |
cooccurrence.cooccurrence | Create a cooccurrence data.frame |
cooccurrence.data.frame | Create a cooccurrence data.frame |
document_term_frequencies | Aggregate a data.frame to the document/term level by calculating how many times a term occurs per document |
document_term_frequencies.character | Aggregate a data.frame to the document/term level by calculating how many times a term occurs per document |
document_term_frequencies.data.frame | Aggregate a data.frame to the document/term level by calculating how many times a term occurs per document |
document_term_matrix | Create a document/term matrix from a data.frame with 1 row per document/term |
document_term_matrix.data.frame | Create a document/term matrix from a data.frame with 1 row per document/term |
document_term_matrix.DocumentTermMatrix | Create a document/term matrix from a data.frame with 1 row per document/term |
document_term_matrix.simple_triplet_matrix | Create a document/term matrix from a data.frame with 1 row per document/term |
document_term_matrix.TermDocumentMatrix | Create a document/term matrix from a data.frame with 1 row per document/term |
dtm_cor | Pearson Correlation for Sparse Matrices |
dtm_remove_lowfreq | Remove terms occurring with low frequency from a Document-Term-Matrix and documents with no terms |
dtm_remove_terms | Remove terms from a Document-Term-Matrix and keep only documents which have at least some terms |
dtm_remove_tfidf | Remove terms from a Document-Term-Matrix and documents with no terms based on the term frequency inverse document frequency |
dtm_reverse | Inverse operation of the document_term_matrix function |
dtm_tfidf | Term Frequency - Inverse Document Frequency calculation |
phrases | Extract phrases - a sequence of terms which follow each other based on a sequence of Parts of Speech tags |
predict.LDA_Gibbs | Predict method for an object of class LDA_VEM or class LDA_Gibbs |
predict.LDA_VEM | Predict method for an object of class LDA_VEM or class LDA_Gibbs |
txt_collapse | Collapse a character vector while removing missing data |
txt_freq | Frequency statistics of elements in a vector |
txt_highlight | Highlight words in a character vector |
txt_next | Get the n-th next element of a vector |
txt_nextgram | Based on a vector with a word sequence, get n-grams |
txt_previous | Get the n-th previous element of a vector |
txt_recode | Recode text to other categories |
txt_sample | Boilerplate function to sample one element from a vector |
txt_show | Boilerplate function to cat only 1 element of a character vector |
udpipe_annotate | Tokenise, Tag and Dependency Parsing Annotation of raw text |
udpipe_annotation_params | List with training options set by the UDPipe community when building models based on the Universal Dependencies data |
udpipe_download_model | Download a UDPipe model provided by the UDPipe community for a specific language of choice |
udpipe_load_model | Load a UDPipe model |
udpipe_train | Train a UDPipe model |
unique_identifier | Create a unique identifier for each combination of fields in a data frame |