cleanNLP-package | cleanNLP: A Tidy Data Model for Natural Language Processing |
cleanNLP | cleanNLP: A Tidy Data Model for Natural Language Processing |
combine_documents | Combine a set of annotations |
dep_frequency | Universal Dependency Frequencies |
doc_id_reset | Reset document ids |
download_core_nlp | Download Java files needed for CoreNLP |
extract_documents | Extract documents from an annotation object |
from_CoNLL | Reads a CoNLL-U or CoNLL-X File |
get_combine | One Table Summary of an Annotation Object |
get_coreference | Access coreferences from an annotation object |
get_dependency | Access dependencies from an annotation object |
get_document | Access document metadata from an annotation object |
get_entity | Access named entities from an annotation object |
get_sentence | Access sentence-level annotations |
get_tfidf | Construct the TF-IDF Matrix from Annotation or Data Frame |
get_token | Access tokens from an annotation object |
get_vector | Access word embedding vector from an annotation object |
init_coreNLP | Interface for initializing the coreNLP backend |
init_spaCy | Interface for initializing the spaCy backend |
init_tokenizers | Interface for initializing the tokenizers backend |
obama | Annotation of Barack Obama's State of the Union Addresses |
pos_frequency | Universal Part of Speech Code Frequencies |
print.annotation | Print a summary of an annotation object |
read_annotation | Read annotation files from disk |
run_annotators | Run the annotation pipeline on a set of documents |
tidy_pca | Compute Principal Components and store as a Data Frame |
to_CoNNL | Returns a CoNLL-U Document |
word_frequency | Most frequent English words |
write_annotation | Write annotation files to disk |