phani1kumar/docker-titan

Language: Shell

git: https://github.com/phani1kumar/docker-titan

This docker-titan repository enables you to build a specific titan branch / specific titan tag with specific dependen…

en_README.md

Titan graph database Docker image

Titan is a free, open-source, scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time.

This Docker image instantiates a Titan graph database that can integrate with an ElasticSearch container (indexing) and a Cassandra container (storage).

The concepts used here can be extended to support any other storage or indexing backend; for that, you would need to modify the run.sh file to suit your needs. The Titan Docker image builds the full version of Titan, so supporting another storage or indexing engine is merely a matter of adjusting the configuration.

The default distribution of Titan runs on a single node. With this Docker image, it is possible to hook into the most commonly used storage and indexing dependencies of Titan (Cassandra and ElasticSearch, respectively).

Thanks to Docker, it is possible to achieve a separation of concerns and at the same time co-locate the backend and titan-server, as suggested by the authors of the database.

Note: the use of Docker is not endorsed by the authors of the database, but they suggest having the storage node and titan-server communicate over localhost. In this Docker scenario, I don't think there is any additional cost we have to look at.

Titan

Using this project, you can build Titan from any specific tag or branch you wish. This is supported directly by the configuration in the Dockerfile:

# search for the following entries and modify them as necessary.
ENV TITAN_VERSION="0.9.0-SNAPSHOT"
ENV TITAN_BRANCH="titan09"

You may even build a container from a specific commit on a Titan code branch. For this, you would need to update the RUN command where the git checkout is performed.

Want to build Titan from your own fork because you have made changes to the open source code? No worries: just change the git repository path that the container build obtains the package from, and you are done.

You can find more details about Titan on its page or the live forum.

Note: If you wish to build Titan from an older version, branch, or tag, please make sure that you have the correct version of the JDK in the Dockerfile. If you are modern enough to use Docker, I wouldn't expect you to build an older version of Titan though. Of course, it is your business and your decision :)

Tinkerpop and Gremlin

TinkerPop3 provides graph computing capabilities for both graph databases (OLTP) and graph analytic systems (OLAP) under the Apache2 license.

This project lets you customize the version of Tinkerpop3 you depend on. The current implementation always fetches the latest master branch and builds it.

In case you need a specific tag or branch, please feel free to modify the Dockerfile, using the Titan portion of the git checkouts as an example. If you don't want to deal with building Tinkerpop3 at all and wish to use Titan directly, change the flag to any value other than YES.

ENV CUSTOMIZE_TINKERPOP="YES"

Running

The minimum system requirements for this stack are 1 GB with 2 cores.

First run the required external dependencies, in our case Cassandra and ElasticSearch. The following snippets run these nodes, but I would encourage you to go to the respective GitHub projects for the latest relevant instructions.

# I am using the 2.0 version of cassandra and 1.5 version of elasticsearch. 
# At the moment these are the versions supported by titan-0.9.0-M2. 
# If the versions change, please feel free to use the correct versions
docker run --name cas1 -d cassandra:2.0
docker run -d --name elas1 elasticsearch:1.5
docker run -d -P --name mytitan --link elas1:elasticsearch --link cas1:cassandra <YOUR TITAN IMAGE>

If you wish to run Cassandra / Elasticsearch as clusters of their own, please refer to the documentation of the above-mentioned projects. For simplicity, I am omitting that topic here.

When running in Docker containers, I would encourage you to mount the data directories from your host filesystem. You can do this as follows:

docker run -d --name cas1 -v /mnt/Share/titandb/cassdata:/var/lib/cassandra/data cassandra:2.0
docker run -d --name elas1 -v /mnt/Share/titandb/es_index_data:/usr/share/elasticsearch/data elasticsearch:1.5
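
Docker typically creates a missing bind-mount directory itself, so creating the directories up front lets you control their ownership. The following is a minimal sketch, not part of this repository: `titan_volume_cmds` is a hypothetical helper name, and it assumes the same container names, image tags and container paths as the commands above. It creates the two host directories under a base path and prints the matching `docker run` lines so you can review them before running them.

```shell
#!/bin/sh
# Sketch only: titan_volume_cmds is a hypothetical helper, not part of this repository.
# It creates the host data directories under a base path (default taken from the
# example above) and prints the docker run commands instead of executing them.
# Assumption: the base path contains no spaces.
titan_volume_cmds() {
    base=${1:-/mnt/Share/titandb}
    mkdir -p "$base/cassdata" "$base/es_index_data" || return 1
    printf 'docker run -d --name cas1 -v %s/cassdata:/var/lib/cassandra/data cassandra:2.0\n' "$base"
    printf 'docker run -d --name elas1 -v %s/es_index_data:/usr/share/elasticsearch/data elasticsearch:1.5\n' "$base"
}
```

Calling `titan_volume_cmds /some/base/path` prints two lines; pipe them to `sh` only once you are happy with what they say.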

The following is a visual depiction of how the whole thing looks:
[Figure: docker_topology]

One more thing to note: when you are using Docker containers with a mounted file system as above, you should be aware of the facts mentioned in the stackoverflow thread. I spent a good 2 days beating around the bush to figure this out. See the details of the issue I faced.

Run on Linux with docker-compose

docker-compose up

To test that the server is running (with websockets), run this:

bin/test

If you see this, everything is good:

HTTP/1.1 101 Web Socket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
WebSocket-Origin: http://localhost:8182
WebSocket-Location: ws://localhost/gremlin
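
If you want to check that response in a script rather than by eye, the status line is the part that matters. The sketch below is an assumption-laden illustration: `is_handshake_ok` is a hypothetical helper, and what `bin/test` does internally is not shown in this README. It reads an HTTP response on standard input and succeeds only when the first line reports status 101.

```shell
#!/bin/sh
# Sketch only: is_handshake_ok is a hypothetical helper, not part of this repository.
# Reads an HTTP response on stdin and returns 0 if the status line reports 101
# (the WebSocket upgrade succeeded), 1 otherwise.
is_handshake_ok() {
    # Accept a final line without a trailing newline; fail on empty input.
    IFS= read -r status_line || [ -n "$status_line" ] || return 1
    case $status_line in
        HTTP/*" 101"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, assuming `bin/test` writes the response to standard output, `bin/test | is_handshake_ok && echo "websocket up"` would report success.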

To access the gremlin console:

bin/gremlin

Ports

8182: Websocket port (in case you are using the Websocket (default) version in the run.sh file)
8182: HTTP port for REST API (in case you are using the HTTP / JSON version in the run.sh file)

8184: JMX Port (You won't need to use this, probably)
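
Because the Titan container in the run example is started with `-P`, Docker publishes each exposed container port on a host port of its own choosing. Assuming the image exposes 8182, `docker port mytitan 8182` prints the mapping as lines of the form `0.0.0.0:<host port>`. A small sketch, with `host_port_from` as a hypothetical helper name, extracts the port number from that output:

```shell
#!/bin/sh
# Sketch only: host_port_from is a hypothetical helper, not part of this repository.
# Takes the output of `docker port <container> <port>` (lines such as
# "0.0.0.0:32768") and prints the host port found on the first line.
host_port_from() {
    first_line=$(printf '%s\n' "$1" | head -n 1)
    # Strip everything up to and including the last colon.
    printf '%s\n' "${first_line##*:}"
}

# Usage (needs a running container, so it is left commented out):
#   host_port_from "$(docker port mytitan 8182)"
```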

To test out the REST API (over Boot2docker):

curl "http://docker.local:8182?gremlin=100-1"
curl "http://docker.local:8182?gremlin=g.addV('Name','Eric')"
curl "http://docker.local:8182?gremlin=g.V()"
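
The second query contains quotes, parentheses and a comma, which are safer percent-encoded inside a URL. curl can do this itself with `curl -G --data-urlencode "gremlin=g.V()" "http://docker.local:8182"`. The sketch below shows the same idea in plain POSIX shell; `urlencode` and `gremlin_url` are hypothetical helper names, and only ASCII input is handled.

```shell
#!/bin/sh
# Sketch only: urlencode and gremlin_url are hypothetical helpers, not part of
# this repository. Assumption: the query is plain ASCII.

# Percent-encode every character except the URL-unreserved set A-Z a-z 0-9 . ~ _ -
urlencode() {
    rest=$1
    encoded=
    while [ -n "$rest" ]; do
        remainder=${rest#?}          # everything after the first character
        c=${rest%"$remainder"}       # the first character itself
        rest=$remainder
        case $c in
            [A-Za-z0-9.~_-]) encoded=$encoded$c ;;
            *) encoded=$encoded$(printf '%%%02X' "'$c") ;;
        esac
    done
    printf '%s\n' "$encoded"
}

# Build the REST URL for a host, a port and a Gremlin query.
gremlin_url() {
    printf 'http://%s:%s?gremlin=%s\n' "$1" "$2" "$(urlencode "$3")"
}

# Usage (needs a running server, so it is left commented out):
#   curl "$(gremlin_url docker.local 8182 "g.addV('Name','Eric')")"
```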

For Websocket testing, you would need a proper application or gremlin-console. At the time of this writing, the gremlin-console that comes with titan-0.9.0-M2 has some issues with remote connections. Hence I've tested using a simple Scala application connecting through gremlin-driver.

Dependencies

I've tested this container with the following containers:

- docker-library/cassandra: This is the Cassandra Storage backend for Titan
- docker-library/elasticsearch: This is the ElasticSearch Indexing backend for Titan. It provides search capabilities for Titan graph datasets.

Hurdles faced and how I overcame them

  1. When using Docker containers with a mounted file system, you should be aware of the facts mentioned in the stackoverflow thread. I spent a good 2 days beating around the bush to figure this out. See the details of the issue I faced.
  2. Not being able to connect to the titan-server through gremlin-driver. This was addressed by the suggestion from Stephen at Tinkerpop3-815.