Generate and visualize embeddings of ICD-10 codes. The embedding of a code is the simple average of the embeddings of all visits assigned that code. The visualization is generated on a 2D plane by the t-SNE algorithm and plotted with ggplot. If t-SNE cannot be computed (for example, because the perplexity is too large for the number of samples), the first two principal components from PCA are plotted instead. The plot can optionally be saved to a given PDF file.
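The averaging step described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy values are made up, and it assumes the row names of the visit matrix hold the visit IDs, which this page does not confirm.

```r
# Toy data: 4 visits embedded in 3 dimensions (values are made up).
visits_vectors <- matrix(
  c(1, 0, 2,
    3, 2, 0,
    0, 4, 4,
    2, 2, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("v1", "v2", "v3", "v4"), NULL)
)

visit_table <- data.frame(
  visit_id = c("v1", "v2", "v3", "v4"),
  icd10    = c("J00", "J00", "E03", "E03"),
  stringsAsFactors = FALSE
)

# Align table rows to matrix rows by visit ID
# (assumption: the matrix row names are the visit IDs).
codes <- visit_table$icd10[match(rownames(visits_vectors), visit_table$visit_id)]

# rowsum() adds up the rows within each group; dividing by the
# group sizes turns the sums into per-code means.
sums   <- rowsum(visits_vectors, group = codes)
counts <- as.vector(table(codes)[rownames(sums)])
icd10_vectors <- sums / counts

icd10_vectors
```

Each row of `icd10_vectors` is then the mean of the visit vectors that share that code.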

visualize_icd10(visits_vectors, visit_table, method = "tsne",
  save = FALSE, path_to_save)

Arguments

visit_table

A data frame with columns:

visit_id

icd10

ICD-10 code of the visit

method

Either "tsne" (default) or "pca": the method used to generate the plot

save

A logical indicating whether the plot should be saved to a file

path_to_save

An optional string giving the path to the target PDF file

visits_vectors

A matrix of embeddings of visits

Value

A generated plot of embeddings.
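The t-SNE-with-PCA-fallback behaviour described at the top of this page could be sketched roughly as follows. The helper `project_2d` and the toy matrix are hypothetical and are not part of the package. The sketch assumes the Rtsne package for t-SNE and falls back to `stats::prcomp` when Rtsne is missing or raises an error, such as the perplexity being too large for the number of samples.

```r
# Hypothetical helper: project the rows of `x` to 2D.
# Tries t-SNE first and falls back to PCA if t-SNE is unavailable or fails.
project_2d <- function(x, method = c("tsne", "pca"), perplexity = 30) {
  method <- match.arg(method)
  coords <- NULL
  if (method == "tsne" && requireNamespace("Rtsne", quietly = TRUE)) {
    coords <- tryCatch(
      Rtsne::Rtsne(x, dims = 2, perplexity = perplexity,
                   check_duplicates = FALSE)$Y,
      error = function(e) NULL  # e.g. perplexity too large for the sample count
    )
  }
  used <- "tsne"
  if (is.null(coords)) {
    # First two principal components
    coords <- stats::prcomp(x)$x[, 1:2, drop = FALSE]
    used <- "pca"
  }
  data.frame(code = rownames(x), x = coords[, 1], y = coords[, 2],
             method = used, row.names = NULL)
}

# Toy code embeddings: 5 codes in 10 dimensions (random values).
set.seed(1)
toy <- matrix(rnorm(50), nrow = 5,
              dimnames = list(c("A00", "E03", "I10", "J00", "K21"), NULL))

# With only 5 samples the default perplexity is too large, so PCA is used.
projected <- project_2d(toy)

# Optional: draw the labelled points if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(projected, ggplot2::aes(x = x, y = y, label = code)) +
    ggplot2::geom_text()
}
```

Returning the coordinates as a data frame keeps the projection separate from the plotting, so either step can be swapped out independently.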

Examples

# Embed the terms from interviews and examinations
inter_term_vectors <- embed_terms(interviews, embedding_size = 10L, term_count_min = 1L)
exam_term_vectors <- embed_terms(examinations, embedding_size = 10L, term_count_min = 1L)
# Combine the term embeddings into visit embeddings
visits_vectors <- embed_list_visits(interviews, examinations, inter_term_vectors, exam_term_vectors)
# Plot the ICD-10 code embeddings (t-SNE by default)
visualize_icd10(visits_vectors, visits)