A Tidy Data Model for Natural Language Processing


Documentation for package ‘cleanNLP’ version 1.10.0

Help Pages

cleanNLP-package    cleanNLP: A Tidy Data Model for Natural Language Processing
cleanNLP            cleanNLP: A Tidy Data Model for Natural Language Processing
combine_documents   Combine a set of annotations
dep_frequency       Universal Dependency Frequencies
doc_id_reset        Reset document ids
download_core_nlp   Download Java files needed for CoreNLP
extract_documents   Extract documents from an annotation object
from_CoNLL          Reads a CoNLL-U or CoNLL-X File
get_combine         One Table Summary of an Annotation Object
get_coreference     Access coreferences from an annotation object
get_dependency      Access dependencies from an annotation object
get_document        Access document metadata from an annotation object
get_entity          Access named entities from an annotation object
get_sentence        Access sentence-level annotations
get_tfidf           Construct the TF-IDF Matrix from Annotation or Data Frame
get_token           Access tokens from an annotation object
get_vector          Access word embedding vector from an annotation object
init_coreNLP        Interface for initializing the coreNLP backend
init_spaCy          Interface for initializing the spaCy backend
init_tokenizers     Interface for initializing the tokenizers backend
obama               Annotation of Barack Obama's State of the Union Addresses
pos_frequency       Universal Part of Speech Code Frequencies
print.annotation    Print a summary of an annotation object
read_annotation     Read annotation files from disk
run_annotators      Run the annotation pipeline on a set of documents
tidy_pca            Compute Principal Components and store as a Data Frame
to_CoNNL            Returns a CoNLL-U Document
word_frequency      Most frequent English words
write_annotation    Write annotation files to disk