Managing, Querying and Analyzing Tokenized Text



Documentation for package ‘corpustools’ version 0.3.1

Help Pages

A B C D E F G K L M P R S T

-- A --

add_collocation_label Choose and add collocation strings based on collocation categories
as.tcorpus Coerce an object to the tCorpus class
as.tcorpus.default Coerce an object to the tCorpus class
as.tcorpus.tCorpus Coerce an object to the tCorpus class

-- B --

backbone_filter Extract the backbone of a network

-- C --

calc_chi2 Compute the chi^2 statistic for a 2x2 crosstab containing the values [a, b] [c, d]
code_features Code features in a tCorpus based on a search string
compare_corpus Compare tCorpus vocabulary to that of another (reference) tCorpus
compare_documents Calculate the similarity of documents
compare_subset Compare vocabulary of a subset of a tCorpus to the rest of the tCorpus
context Get a context vector
corenlp_tokens coreNLP example sentences
create_tcorpus Create a tCorpus
create_tcorpus.character Create a tCorpus
create_tcorpus.data.frame Create a tCorpus
create_tcorpus.factor Create a tCorpus

-- D --

deduplicate Deduplicate documents
delete_columns Delete columns from the data and meta data
delete_meta_columns Delete columns from the data and meta data
docfreq_filter Support function for subset method
dtm.tCorpus Create a document term matrix
dtm_compare Compare two document term matrices
dtm_wordcloud Plot a word cloud from a dtm

-- E --

ego_semnet Create an ego network

-- F --

feature_associations Get common nearby terms given a feature query
feature_stats Feature statistics
feature_subset Filter features
freq_filter Support function for subset method

-- G --

get Access the data from a tCorpus
get_global_i Compute global feature positions
get_meta Access the data from a tCorpus
get_stopwords Get a character vector of stopwords

-- K --

kwic Get keyword-in-context (KWIC) strings

-- L --

laplace Laplace (i.e. add constant) smoothing
lda_fit Estimate an LDA topic model

-- M --

merge_tcorpora Merge tCorpus objects

-- P --

plot.vocabularyComparison Visualize a vocabularyComparison
plot_semnet Visualize a semnet network
plot_words Plot a wordcloud with words ordered and coloured according to a dimension (x)
preprocess Preprocess feature
preprocess_tokens Preprocess tokens in a character vector
print.contextHits S3 print for contextHits class
print.featureHits S3 print for featureHits class
print.tCorpus S3 print for tCorpus class

-- R --

read_text Print tokens as text
refresh_tcorpus Refresh a tCorpus object using the current version of corpustools

-- S --

search_contexts Search for documents or sentences using Boolean queries
search_features Find tokens using a Lucene-like search query
search_recode Recode features in a tCorpus based on a search string
semnet Create a semantic network based on the co-occurrence of tokens in documents
semnet_window Create a semantic network based on the co-occurrence of tokens in token windows
set Modify the token and meta data.tables of a tCorpus
set_levels Change levels of factor columns
set_meta Modify the token and meta data.tables of a tCorpus
set_meta_levels Change levels of factor columns
set_meta_name Change column names of data and meta data
set_name Change column names of data and meta data
set_network_attributes Set some default network attributes for pretty plotting
sgt Simple Good Turing smoothing
sotu_texts State of the Union addresses
stopwords_list Basic stopword lists
subset Subset a tCorpus
subset_meta Subset a tCorpus
subset_query Subset tCorpus token data using a query
summary.contextHits S3 summary for contextHits class
summary.featureHits S3 summary for featureHits class
summary.tCorpus Summary of a tCorpus object

-- T --

tCorpus tCorpus: a corpus class for tokenized texts
tcorpus tCorpus: a corpus class for tokenized texts
tCorpus$code_features Code features in a tCorpus based on a search string
tCorpus$compare_corpus Compare tCorpus vocabulary to that of another (reference) tCorpus
tCorpus$compare_documents Calculate the similarity of documents
tCorpus$compare_subset Compare vocabulary of a subset of a tCorpus to the rest of the tCorpus
tCorpus$context Get a context vector
tCorpus$deduplicate Deduplicate documents
tCorpus$delete_columns Delete columns from the data and meta data
tCorpus$delete_meta_columns Delete columns from the data and meta data
tCorpus$dtm Create a document term matrix
tCorpus$feature_associations Get common nearby terms given a feature query
tCorpus$feature_stats Feature statistics
tCorpus$feature_subset Filter features
tCorpus$get Access the data from a tCorpus
tCorpus$get_meta Access the data from a tCorpus
tCorpus$kwic Get keyword-in-context (KWIC) strings
tCorpus$lda_fit Estimate an LDA topic model
tCorpus$preprocess Preprocess feature
tCorpus$read_text Print tokens as text
tCorpus$search_contexts Search for documents or sentences using Boolean queries
tCorpus$search_features Find tokens using a Lucene-like search query
tCorpus$search_recode Recode features in a tCorpus based on a search string
tCorpus$semnet Create a semantic network based on the co-occurrence of tokens in documents
tCorpus$semnet_window Create a semantic network based on the co-occurrence of tokens in token windows
tCorpus$set Modify the token and meta data.tables of a tCorpus
tCorpus$set_levels Change levels of factor columns
tCorpus$set_meta Modify the token and meta data.tables of a tCorpus
tCorpus$set_meta_levels Change levels of factor columns
tCorpus$set_meta_name Change column names of data and meta data
tCorpus$set_name Change column names of data and meta data
tCorpus$subset Subset a tCorpus
tCorpus$subset_meta Subset a tCorpus
tCorpus$subset_query Subset tCorpus token data using a query
tCorpus$top_features Show top features
tCorpus_compare Corpus comparison
tCorpus_create Creating a tCorpus
tCorpus_data Methods for viewing, modifying and subsetting tCorpus data
tCorpus_docsim Document similarity
tCorpus_features Preprocessing, subsetting and analyzing features
tCorpus_modify_by_reference Modify tCorpus by reference
tCorpus_querying Use Boolean queries to analyze the tCorpus
tCorpus_semnet Feature co-occurrence based semantic network analysis
tCorpus_topmod Topic modeling
tokens_to_tcorpus Create a tCorpus based on tokens (i.e. preprocessed texts)
tokenWindowOccurence Gives the window in which a term occurred in a matrix
top_features Show top features