datamicroscopes/lda

Language: C++

git: https://github.com/datamicroscopes/lda

Latent Dirichlet allocation (LDA) for datamicroscopes

en_README.md

microscopes-lda

A Python package for finding unobserved structure in unstructured data.

This package contains an implementation of the nonparametric (HDP) latent Dirichlet allocation (LDA) model described by Teh et al. in Hierarchical Dirichlet Processes (Journal of the American Statistical Association 101: pp. 1566–1581). Unlike the original LDA model, nonparametric LDA does not require the user to choose the number of topics in advance. Instead, the number of topics is inferred from the data using a hierarchical Dirichlet process prior.

The current kernel follows the sampling scheme described in Section 5.1, "Posterior sampling in the Chinese restaurant franchise", of that paper. In the future, we may support the other kernels described in Teh's paper.
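
To give a feel for why the number of topics does not have to be fixed up front, here is a minimal sketch of the single-level Chinese restaurant process that the Chinese restaurant franchise builds on. It is a standalone illustration only, not this package's API or its C++ kernel; the function name `sample_crp_table_counts` and the parameter `alpha` are hypothetical names chosen for this example.

```python
import random


def sample_crp_table_counts(n_customers, alpha, rng):
    """Seat customers one at a time; return how many customers sit at each table.

    Hypothetical helper for illustration only; not part of microscopes-lda.
    """
    table_counts = []
    for n_seated in range(n_customers):
        # An existing table k is chosen with probability
        # count_k / (n_seated + alpha); a new table is opened with
        # probability alpha / (n_seated + alpha).
        threshold = rng.random() * (n_seated + alpha)
        cumulative = 0.0
        for k, count in enumerate(table_counts):
            cumulative += count
            if threshold < cumulative:
                table_counts[k] += 1
                break
        else:
            # The draw fell in the "alpha" region, so open a new table.
            table_counts.append(1)
    return table_counts


rng = random.Random(0)
counts = sample_crp_table_counts(1000, alpha=1.0, rng=rng)
print("tables opened:", len(counts))
print("customers seated:", sum(counts))
```

The number of tables is an outcome of the sampling rather than an input, and it tends to grow slowly (roughly logarithmically) with the number of customers. In the franchise construction, each document plays the role of a restaurant and the tables in all restaurants share a common menu of dishes (topics), which is how the number of topics ends up being inferred from the data.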

Numerical computation is implemented in C++ for efficiency.

Installation

OS X and Linux builds of microscopes-lda are released to Anaconda.org. Installing them requires Conda. To install the current release, run:

$ conda install -c datamicroscopes -c distributions microscopes-lda