gregs1104/peg

Language: Shell

git: https://github.com/gregs1104/peg

PostgreSQL Environment Generator

What is peg?

peg stands for PostgreSQL Environment Generator. It's a fairly complicated
script that automates most of the repetitive work required to install test
instances of PostgreSQL. It's primarily aimed at developers who are building
from source control (git, cvs) and toward reviewers who are trying out a
source code package (.tar.gz), including release source snapshots and the
alpha/beta builds. It works on UNIX-like systems that already have
the rest of the build tools necessary for compiling PostgreSQL installed.

Motivation

peg usage relies heavily on environment variables. As much as possible, it
mirrors the environment settings of two common PostgreSQL setups: psql and
the RPM packaging of the program.

The psql client and other programs that use the libpq library look for
environment variables such as PGPORT to change their behavior. Server
startup does some of this as well, such as using PGDATA to set the database
location.

The RPM packaging used by RedHat's PostgreSQL server uses a sourced
environment stored in /etc/sysconfig/pgsql/postgresql[version] which
can define settings like PGDATA. One useful idiom when working with
PostgreSQL is to have the login profiles of the postgres user and of others
using Postgres source this file, so that they get the exact same
settings used to start the server. That file might look like this::

PGENGINE=/usr/pgsql-9.1/bin
PGPORT=5432
PGDATA=/var/lib/pgsql/9.1/data
PGLOG=/var/lib/pgsql/9.1/pgstartup.log

To make these available to the user running the database, typically
postgres, that user's login profile would execute a shell command like
this::

source /etc/sysconfig/pgsql/postgresql-9.1
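In a login profile it is usually safer to source the file only when it is readable, so that a missing file does not produce an error at every login. The following is a minimal sketch under two assumptions: a temporary file stands in for /etc/sysconfig/pgsql/postgresql-9.1 so the snippet can be run anywhere, and ``set -a`` is used because the sample file contains plain assignments without ``export``:

```shell
# Stand-in for /etc/sysconfig/pgsql/postgresql-9.1 (hypothetical temp copy)
pgenv=$(mktemp)
cat > "$pgenv" <<'EOF'
PGENGINE=/usr/pgsql-9.1/bin
PGPORT=5432
PGDATA=/var/lib/pgsql/9.1/data
PGLOG=/var/lib/pgsql/9.1/pgstartup.log
EOF

# Source the file only if it is readable. While "set -a" is active, every
# variable that gets assigned is also exported, so child processes such as
# pg_ctl will see the settings.
if [ -r "$pgenv" ]; then
    set -a
    . "$pgenv"
    set +a
fi

echo "PGDATA=$PGDATA PGPORT=$PGPORT"
rm -f "$pgenv"
```

In a real profile the ``mktemp`` and ``cat`` lines would go away and ``$pgenv`` would simply be the path of the sysconfig file.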

The idea is that once you've set up an environment with peg, the environment
variables you'll have available to you will match what you'd get if you
were logging into a production RHEL server running the standard PostgreSQL
RPM set. Once peg has set up your environment, commands like the following
will work, exactly the same as if you'd sourced the sysconfig file on
a production RHEL server::

pg_ctl start -l $PGLOG
$EDITOR $PGDATA/postgresql.conf

Whether or not you find that useful, you'll also get source code builds and
cluster creation, setup, and control automated by using peg.

peg takes commands similarly to common version control systems: you call
the one executable script with a subcommand, followed by what you want it to
operate on, which right now is always a project name.

Installation

peg is so far a single script; its only dependency is Bash. You
can just copy or link it to any directory that's in your path, such
as /usr/local/bin for a system-wide install. Here's a sample
installation that checks out a working copy and then
links to that copy in a standard directory found in the PATH
on Linux systems::

$ cd $HOME
$ git clone https://github.com/gregs1104/peg.git
$ sudo ln -s $HOME/peg/peg /usr/local/bin/

Creating a first project

The easiest way to make peg work is to create a $HOME/pgwork directory and
let peg manage the whole thing for you::

# git repo setup example (you must be running bash)
cd
mkdir -p pgwork
peg init test
. peg build
psql

At this point your sole git repository will be switched to a new branch that
matches the name of the project.

Directory layout

peg assumes you are installing your Postgres database as a regular user,
typically in your home directory. It allows you to have multiple "projects"
installed in this structure, each of which gets its own source code, binaries,
and database.

The example above will get you a directory tree that looks like this
(this is simplified to only include the basic outline; there will be
more directories below these)::

pgwork/
|-- data
|   `-- test
|-- inst
|   `-- test
|       `-- bin
|-- repo
|   `-- git
`-- src
    `-- test

There will only be one repo directory shared by all of the projects hosted in
this work area, and each project will check out its own unique copy of that
repo into its work area--unless you are using git; see below.

The top level directories are intended to work like this:

  • pgwork/repo: The one copy of the master repo all projects share
  • pgwork/src/project: Checked out copy of the repo with this project's source
  • pgwork/inst/project: Binaries built by the "make install" step
  • pgwork/data/project: Holds the database cluster created by initdb
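
To make the layout concrete, the following sketch creates the same skeleton by hand in a scratch directory. peg creates these directories itself; this is only an illustration, and the ``test`` project name and the temporary location are placeholders:

```shell
# Build an empty copy of the layout in a throwaway directory
work=$(mktemp -d)/pgwork
project=test

mkdir -p "$work/repo/git" \
         "$work/src" \
         "$work/inst/$project/bin" \
         "$work/data/$project"

# With a git repo, src/<project> is a symlink back to the shared repo
ln -s "$work/repo/git" "$work/src/$project"

# List what was created, relative to the work area
(cd "$work" && find . | sort)
```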

Shortcuts

Once you've gotten an environment set up, using it again can be as easy as::

. peg switch

That will use whatever project name you last gave the script and start the
server up. If you shut down your system, that should be all that's needed
to bring things back up again.

When you source the peg script, along with making settings like PGDATA
available, it also creates a few quick aliases you can use instead
of calling peg for those functions:

  • start: If you already have a database cluster set up, start it
  • stop: Stop a running database with the slow shutdown code
  • immediate: Stop a running database immediately

Here again, the names were picked to be similar to the
"service postgresql start" and stop commands used by the RPM packaging.
start and stop are used on some UNIX systems for job control or system
initialization. It's unlikely you'll need those while doing PostgreSQL
work too, so re-using those commands for this should save you some typing.

Use peg for performance testing

The default build method used by peg includes assertions, which will
slow down the resulting server code considerably. If you
want to build without assertions and debugging information, you'll need
to set PGDEBUG to a non-empty value. That will be passed through to
the PostgreSQL "configure" program without turning on any of the
debugging features. A space works for this, for example::

export PGDEBUG=" "

Setting this before the build step makes peg use the standard build
options, rather than the debugging ones that slow the server down.
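
Why a single space is enough: a common shell idiom for "use a default unless the variable is set to something" is ``${VAR:-default}``, which treats an empty string like an unset variable but accepts any non-empty value, including a space. The sketch below illustrates that idiom. It is an assumption about the mechanism, not a copy of peg's code, and the default flag list shown is illustrative:

```shell
# Illustrative debugging defaults (assumed; check the peg script for the real list)
default_flags="--enable-cassert --enable-debug"

configure_flags() {
    # ${PGDEBUG:-...} substitutes the default when PGDEBUG is unset OR empty
    printf '%s\n' "${PGDEBUG:-$default_flags}"
}

unset PGDEBUG
echo "unset: [$(configure_flags)]"

PGDEBUG=""
echo "empty: [$(configure_flags)]"

PGDEBUG=" "
echo "space: [$(configure_flags)]"
```

Only the last call prints the single space instead of the debugging flags, which is the behavior the text above relies on.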

peg with git details

Source code layout for git

If you are using git for your repo, the src/ directory is just a symbolic link
to the repo itself, so that every project shares the same repo. Each project
is instead given its own branch when you first use "peg init". This seems
to match the workflow of git users better than checking out a separate copy
for each project. This is simple enough to change: in between the init and
build steps, you can remove the symlink and manually copy the master repo.

TODO: Provide an example of that
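
One way that change might look, assuming a project named ``test`` and the default layout. The sketch runs against a scratch copy of the layout so it can be tried safely; in a real work area only the ``rm`` and ``cp`` commands would be needed, between ``peg init test`` and ``. peg build``. It has not been verified against every peg version:

```shell
# Scratch layout standing in for $HOME/pgwork after "peg init test"
work=$(mktemp -d)/pgwork
mkdir -p "$work/repo/git" "$work/src"
echo "placeholder" > "$work/repo/git/README"
ln -s "$work/repo/git" "$work/src/test"

# Replace the symlink with a private copy of the master repo.
# rm without -r removes only the link, never the repo it points at.
rm "$work/src/test"
cp -Rp "$work/repo/git" "$work/src/test"

ls -ld "$work/src/test"
```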

If you intend to work on multiple projects using a single git repo, make
sure you note the "Known Issues" section below for caveats about
common problems, and how to resolve them.

Applying a patch to a git repo project

Here's how you might test a patch using git for the base repo::

peg init test
cd pgwork/src/test
git apply some.patch
. peg build
psql

Some patches aren't handled by git's apply. If that fails with errors,
try the following instead::

patch -p1 < some.patch

The parameter passed to "-p" in this case can vary; 0 is also common.
You'll need to be able to read the patch to predict what it should be.
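
As a rough guide, patches produced by ``git diff`` prefix paths with ``a/`` and ``b/`` and need ``-p1``, while older CVS-style patches often use bare relative paths and need ``-p0``. The helper below is a hypothetical sketch (the name ``guess_strip_level`` is not part of peg or patch) that reads the first ``+++`` header line and applies that rule of thumb:

```shell
# Print a suggested -p value for a unified diff, based on its first "+++" line
guess_strip_level() {
    target=$(sed -n 's/^+++ \([^[:space:]]*\).*/\1/p' "$1" | head -n 1)
    case "$target" in
        b/*) echo 1 ;;   # git-style "b/path" prefix: strip one component
        /*)  echo "absolute path, inspect by hand" ;;
        *)   echo 0 ;;   # bare relative path: strip nothing
    esac
}

# Small git-style patch used only for the demonstration
patchfile=$(mktemp)
cat > "$patchfile" <<'EOF'
--- a/src/backend/example.c
+++ b/src/backend/example.c
@@ -1 +1 @@
-old line
+new line
EOF

echo "suggested: patch -p$(guess_strip_level "$patchfile") < some.patch"
rm -f "$patchfile"
```

This is only a heuristic: a project that really has a top-level directory named ``b`` would fool it, so reading the patch header yourself remains the reliable check.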

See http://wiki.postgresql.org/wiki/Working_with_Git for more
information about how to deal with patches, as well as other aspects of
PostgreSQL plus git use.

Sample tgz session

Here's how you might use peg to test out an alpha or beta build
downloaded in source code form. To do that, instead of using a regular
repo, you merely need to create a tgz repo directory and drop the file
in there::

# Repo setup: tgz
cd
mkdir -p pgwork/repo/tgz
cp postgresql-9.1alpha1.tar.gz pgwork/repo/tgz

# Basic install
peg init test
. peg build
psql

cvs or tgz repo with patch

Here's how you might test a patch using CVS for the base repo::

peg init test
cd pgwork/src/test
patch -p 0 < some.patch
. peg build
psql

TODO: Test the above

Sample cvs session

You can clone the postgresql.org cvs repo just by changing your default
PGVCS to be cvs::

cd
mkdir -p pgwork

# Repo setup: cvs
export PGVCS=cvs
peg init test
. peg build
psql

This will synchronize with the master PostgreSQL CVS server via rsync, using
the same techniques documented at
http://wiki.postgresql.org/wiki/Working_with_CVS
(The outline given in its "Initial setup" section is actually peg's distant
ancestor.) The main reason why you might want to use CVS is if you
are doing development on an older server where git cannot be installed.

You can easily force this just by creating a repo/cvs directory too::

cd
mkdir -p pgwork/repo/cvs
peg init test
. peg build
psql

Sample two-cluster session

Here is a complicated peg installation. The intent is to start two database
clusters that share the same source code and binary set, perhaps for testing
replication with two "nodes" on a single server. This is relatively easy
to script, using peg to do most of the dirty work normally required here::

# Two node cluster setup
peg init master
peg init slave

# Make the slave use the same source code and binaries as the master
pushd pgwork/inst
rm -rf slave
ln -s master slave
popd

pushd pgwork/src
rm -rf slave
ln -s master slave
popd

# Start the master
peg build master
# Can't source the above yet, because then PGDATA will be set
# Start the slave
export PGPORT=5433 ; peg start slave ; export PGPORT=
. peg switch master

psql -p 5432 -c "show data_directory"
psql -p 5433 -c "show data_directory"

Note that if you now try to stop the slave like this::

peg stop slave

This won't actually work, because it will still be using the PGDATA
environment variable you sourced in. Instead you need to do this::

unset PGDATA PGLOG
. peg switch slave
peg stop

TODO: The above still doesn't work. But if you start a whole new shell,
that seems to be fine.
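
The ``export PGPORT=5433 ; peg start slave ; export PGPORT=`` line in the session above sets the port, runs one command, and then blanks the variable again. Because peg is executed rather than sourced at that point, the shell's per-command assignment syntax is a more compact alternative that should have a similar effect while leaving the calling shell untouched. The sketch below demonstrates the scoping with a stand-in command instead of peg:

```shell
unset PGPORT

# The assignment applies only to the environment of this one command
PGPORT=5433 sh -c 'echo "child sees PGPORT=$PGPORT"'

# Back in the calling shell the variable was never set
echo "parent sees PGPORT=${PGPORT:-unset}"

# With peg the equivalent would be (not run here):
#   PGPORT=5433 peg start slave
```

This does not work for sourced invocations such as ``. peg switch``, since those run inside the current shell.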

Sample backporting setup

Backporting involves taking code from a newer version of a program and
applying it to an earlier one. For this example, imagine that the goal
is to apply a patch developed for PostgreSQL 9.2 to version 9.1. This
only works in peg if you are using the default git repository, where it's
easy to check out any version of the database code.

A backporting setup works just like a regular git session, except that you
set the PGVERSION environment variable first::

export PGVERSION="9.1"
cd
mkdir -p pgwork
peg init test

Now you can apply the changes you have from a later version to the source
code, using something like the patch application example above. In some
cases, source changes must be applied before compiling PostgreSQL at all.
Many source code patches can also be applied and worked on after a build
step.

If the change you're looking for is a feature added to a later PostgreSQL
version, you might apply it to the older version you have checked out
using "git cherry-pick" instead.

Once all the changes are made, build the database source code and test::

. peg build
psql

Base directory detection

The entire peg directory tree is based in a directory recommended to be
named pgwork. If you use another directory, you can make the script use
it by setting the PGWORK environment variable. The sequence searched to find
a working area is:

  1. The value passed for PGWORK
  2. The current directory
  3. $HOME/pgwork
  4. $HOME

peg assumes it found a correct working area when there is a "repo"
subdirectory in one of these locations.
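
The search can be pictured as a short loop over candidate directories. This is a sketch of the documented behavior rather than peg's actual code; the function name ``find_pgwork`` is made up for illustration:

```shell
# Print the first candidate directory that contains a "repo" subdirectory
find_pgwork() {
    for candidate in "${PGWORK:-}" "$PWD" "$HOME/pgwork" "$HOME"; do
        # Skip empty entries, e.g. when PGWORK is not set
        [ -n "$candidate" ] || continue
        if [ -d "$candidate/repo" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Demonstration against a scratch directory
scratch=$(mktemp -d)
mkdir -p "$scratch/mywork/repo"
PGWORK="$scratch/mywork"

if workdir=$(find_pgwork); then
    echo "work area: $workdir"
else
    echo "no work area found" >&2
fi
```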

Command summary

The following subcommands are accepted by peg:

  • status: Report on environment variables that would be in place if you were
    to execute a command. Useful mainly for troubleshooting.
  • init: Create a repo and a project based on it, if one is named
  • update: Update your repo and a project based on it, if one is named
  • build-only: Execute the build steps and create a database cluster, but
    don't start it. This is typically for when you know you need to modify the
    database configuration before you start it.
  • build: Build binaries, install them, create a cluster, start the database
  • rebuild: Rebuild and install just the main binaries for the server in
    the src/backend directory. When making changes to just the core server
    code, this can save time over doing a full build.
  • initdb: Create a cluster
  • switch: Switch to an existing built binary set and cluster
  • start: Start a cluster
  • stop: Stop a cluster
  • rm: Remove all data from a project (but not the repo)

Environment variable reference

You can see the main environment variables peg uses internally with::

peg status

All of those values are set automatically only if you don't explicitly set
them to something first. This allows you to do things like use peg to
manage your source and binary builds, while still having a PGDATA that
points to a separate location where you want your database to go.
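
In shell terms, this "only if not already set" behavior corresponds to the ``${VAR:=default}`` expansion. The sketch below is illustrative and not taken from peg; the paths follow the directory layout described earlier, and 5432 is PostgreSQL's standard port, used here as an assumed default:

```shell
# Values the user chose before invoking the tool
PGWORK="$HOME/pgwork"
PGPROJECT="test"
PGDATA="/mnt/fastdisk/pgdata"      # explicitly set: must be left alone
unset PGPORT                        # not set: should receive a default

# ":" is a no-op command; each expansion assigns only when the variable
# is unset or empty
: "${PGDATA:=$PGWORK/data/$PGPROJECT}"
: "${PGPORT:=5432}"

echo "PGDATA=$PGDATA"
echo "PGPORT=$PGPORT"
```

The explicitly chosen PGDATA survives, while PGPORT picks up the default.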

  • PGPORT: Client programs use this to determine what port to connect on;
    if set, any "peg start" commands will start the server on that port. See
    the multi-cluster example for how this might be used.
  • PGVCS: Valid options are "cvs", "git", and "tgz". If you have more
    than one type of source in your repo directory, you can
    use this to specify which of them you should use.
  • PGWORK: Base directory for working area. See "Base directory detection".
  • PGPROJECT: If this is set, it will become the project name used
    for all commands, regardless of what's passed on the command line.
  • PGDEBUG: By default, peg builds PostgreSQL with the standard flags
    you'd want to use for development and testing. Set this to pass
    different options to "configure" instead; see "Use peg for performance
    testing".
  • PGMAKE: Program to run GNU make. This defaults to "make" but can be
    overridden.
  • PGVERSION: Stable version of PostgreSQL to check out, when a new project
    is created with "peg init". If not specified, the "master" branch of the
    repository is used. Only useful if peg is running against a git repo.
    This will accept either standard version numbers like "9.1", or version
    names that match the actual stable branch naming conventions like "9_1".
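
PostgreSQL's stable branches for the 9.x releases are named like ``REL9_1_STABLE``, so both accepted spellings of PGVERSION can be mapped onto the same branch. A hypothetical sketch of such a mapping follows; the function name is invented, and peg's own logic may differ:

```shell
# Map "9.1" or "9_1" to the matching stable branch name; empty means master
version_to_branch() {
    if [ -z "$1" ]; then
        echo "master"
        return
    fi
    # Replace dots with underscores so both spellings converge
    normalized=$(printf '%s' "$1" | tr '.' '_')
    echo "REL${normalized}_STABLE"
}

version_to_branch "9.1"
version_to_branch "9_1"
version_to_branch ""
```

The first two calls print the same branch name, and the last falls back to ``master``, matching the description of PGVERSION above.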

Solaris Use

The defaults for peg are known to have issues building on a typical Solaris
system, where the GNU building toolchain is not necessarily the default one.
Here's a sample configuration you can put into your environment to change
the defaults to work on that platform, assuming you've installed the
Sun Freeware GNU tools in their default location::

export PGMAKE="/usr/sfw/bin/gmake"
export PGDEBUG="--enable-cassert --enable-debug --without-readline --with-libedit-preferred"

Known Issues

See TODO notes in the peg source code (and even this documentation) for the
open issues in the code. A few of these turn into functional issues you
should be aware of.

git Branching and Cleanup

When using git, peg links all projects to a single git directory, with each
project treated as a branch. The program expects that you'll manage
complicated operations here on your own, rather than trying to force git
changes that can potentially be destructive. One area where this can cause
many problems is when you're trying to switch to a new origin branch,
such as when using the PGVERSION variable. Moving from the default branch
(origin/master) to another version, or the reverse, will usually require
some manual cleanup of the git checkout before running "peg init".

You always want to be careful to commit any working code to your active
branch before trying to change to a new project, and therefore a new git
branch. Checking the status of the repository checkout is a good habit to
adopt before running "peg init" to try and create a new project.

You can check if your git checkout is completely cleaned up--and therefore
able to accept a branch change without complaint--by seeing if its status
looks like this::

$ git status
# On branch master
nothing to commit (working directory clean)

If you see modified or untracked files there, a checkout that tries to change
the origin branch version is unlikely to work. A typical error if cleanup
isn't done correctly before "peg init" is many "needs merge" warnings ending
with::

error: you need to resolve your current index first
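
The status check above can also be scripted. ``git status --porcelain`` prints one line per modified or untracked file and nothing at all for a clean checkout, which makes it easy to test before running ``peg init``. A sketch, with an invented helper name:

```shell
# Succeed only when the checkout in directory $1 has no modified or
# untracked files; return 2 if git itself fails (e.g. not a repository)
checkout_is_clean() {
    changes=$(git -C "$1" status --porcelain) || return 2
    [ -z "$changes" ]
}

# Demonstration in a throwaway repository
demo=$(mktemp -d)
git init -q "$demo"

checkout_is_clean "$demo" && echo "clean: safe to switch branches"

touch "$demo/tags"        # e.g. a ctags index, which shows up as untracked
checkout_is_clean "$demo" || echo "dirty: commit or clean up first"
```

With peg and git, the directory to pass would be the project's src directory or the shared repo directory, since they are the same checkout.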

Most files that cause this sort of problem can be cleaned up by going into
the src directory--which is just a link to the repo directory when using peg
with git, so you can go there instead--and executing::

git reset --hard

This will not remove new files that have been added though, which can still
cause you issues. For example, if the same file names have also been used
in the new branch you're checking out, this will cause you some trouble.
One common way developer checkouts can get this sort of "Untracked files"
is if you build ctags to help navigate the source code. All new files and
directories can be removed with::

git clean -f -d

It's very important to move any important files you've added out of the git
directory tree before using "git clean" like this. It will wipe out
everything other than the expected repository files.
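
Before deleting anything, ``git clean`` can list what it would remove: the ``-n`` (dry run) flag prints the candidates without touching them. A short demonstration in a throwaway repository, where the ``notes`` directory and ``tags`` file are made-up examples of untracked content:

```shell
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo" || exit 1

mkdir notes
echo "keep me" > notes/ideas.txt
touch tags

# Dry run: prints "Would remove ..." lines, deletes nothing
git clean -n -d

# Move anything worth keeping out of the tree, then clean for real
mv notes "$(mktemp -d)/"
git clean -f -d
```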

Serious problems

So far these are serious only in the sense that you are likely to run into
them and the problems they cause are annoying. But the workarounds to avoid
each are pretty simple.

  • If you are running against one project and then create a new one, it's
    quite easy to get into a state where environment variables and other
    information set by the old project continue to linger around. If you're
    using a git repo for the code, this is particularly likely to happen
    because switching projects only switches branches in the single shared
    checkout of the repo. That doesn't remove the parts of the source code
    build configuration that refer to the old project: the configure stage
    saves where the binaries are going to be stored, for example.
    The suggested workflow when using git is therefore::

    stop
    peg clean
    peg init newproject
    [start a new terminal session to clear all environment variables]
    peg build
    . peg switch

  • peg has a notion that you might set PGDATA directly, rather than want that
    particular directory structure to be in the same PGWORK area everything
    else is in. And when you source peg into your environment to use
    a project, this sets PGDATA. This combination causes a major issue
    when switching projects that are in fact both hosted in the PGWORK
    structure. You'll get the PGDATA from the original project, and the
    one you're switching to will believe that's a manually set PGDATA it
    should use. So everything else will switch to the new project,
    except the database directory, which is confusing. This problem
    will eventually be addressed in the code. To work around
    it for now, before doing "peg switch" you should erase PGDATA (and PGLOG,
    which suffers from the same issue)::

    unset PGDATA PGLOG

Trivial bugs

  • peg creates a database matching your name, which is what psql wants for a
    default. It doesn't check whether it already exists first though, so you'll
    often see an error about that when starting a database. This is harmless.
  • If you source peg output repeatedly, it will pollute your PATH with
    multiple pointers to the same directory tree. This is mostly harmless, just
    slowing down how fast commands can be found in your PATH a bit.
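
A common way to avoid this kind of duplication in shell scripts is to prepend a directory only when it is not already present. This is a general idiom, not something the text above says peg does; the function name and the example directory are illustrative:

```shell
# Prepend directory $1 to PATH unless it is already one of its entries
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: do nothing
        *) PATH="$1:$PATH" ;;
    esac
}

bindir="$HOME/pgwork/inst/test/bin"
path_prepend "$bindir"
path_prepend "$bindir"               # second call is a no-op

# Count how many times the directory appears in PATH
printf '%s\n' "$PATH" | tr ':' '\n' | grep -c -x -F "$bindir"
```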

Documentation

The documentation README.rst for the program is in ReST markup. Tools
that operate on ReST can be used to make versions of it formatted
for other purposes, such as rst2html to make an HTML version.

Contact

The project is hosted at http://github.com/gregs1104/peg

If you have any hints, changes or improvements, please contact:

  • Greg Smith gsmith@gregsmith.com

Credits

Copyright (c) 2009-2013, Gregory Smith
All rights reserved.
See COPYRIGHT file for full license details

peg was written by Greg Smith to make all of the PostgreSQL systems he
works on regularly have something closer to a common UI at the console.
peg's directory layout and general design were inspired by several members
of the PostgreSQL community, including:

  • Heikki Linnakangas, who outlined his personal work habits for
    interacting with the CVS repo and inspired Greg to
    write the original "CVS+rsync Solutions" section of
    http://wiki.postgresql.org/wiki/Working_with_CVS
  • Alan Li, Jeff Davis, and other members of Truviso who I may not
    directly remember borrowing ideas from as much as those two. Alan
    and Jeff both had their own way to organize PostgreSQL installations
    in their respective home directories that I found interesting when we
    worked together on projects.