Setup Ghiro

Ghiro is designed to run natively on GNU/Linux. This documentation uses the latest Ubuntu LTS Server as the reference system for the command examples, although Ghiro works on any GNU/Linux distribution. Ghiro may also work on other systems such as Mac OS X, but this is untested and out of the scope of this documentation.

Requirements

Ghiro has the following requirements:

  • MongoDB: you need to run a MongoDB database (at least release 2.0)
  • Python: that’s how we roll (Python 2.7 is required; although Ghiro is written to also work with Python 3, some third-party libraries are not)
  • Python-magic: for MIME extraction
  • Python 2.x bindings for gobject-introspection libraries, required by Gexiv2
  • Gexiv2: for metadata extraction (at least release 0.6.1)
  • Pillow (Python Imaging library - PIL fork): for image manipulation
  • Python-dateutil: for datetime manipulation
  • Pymongo: driver for MongoDB (at least release 2.5)
  • Django: for the web interface (at least release 1.5, Django 1.6.x suggested)
  • Chardet: for text encoding detection
  • Pdfkit: used for PDF report generation (at least release 0.4)
  • Wkhtmltopdf: used by pdfkit
  • Requests: used for HTTP requests
  • NudePy: used for nude detection
  • ImageHash: to calculate perceptual image hash

If you choose MySQL or PostgreSQL as the database, you also have to install their drivers.
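A quick way to see which of the Python libraries listed above are still missing is to try importing each one. The following is a minimal sketch, not part of Ghiro: the helper name missing_modules is made up for illustration, and the list holds the usual import names of the packages (for example, Pillow is imported as PIL and NudePy as nude). It does not check versions, and it does not cover MongoDB or wkhtmltopdf, which are not Python modules.

```python
import importlib

# Import names for the Python libraries in the requirements list.
# The import name often differs from the package name (Pillow -> PIL).
REQUIRED_MODULES = [
    "magic", "gi", "PIL", "dateutil", "pymongo", "django",
    "chardet", "pdfkit", "requests", "nude", "imagehash",
]


def missing_modules(names):
    """Return the names from the given list that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python modules: " + ", ".join(missing))
    else:
        print("All required Python modules are importable.")
```

Run it with the same Python interpreter you plan to use for Ghiro, otherwise the result may not reflect the right environment.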

The Ghiro web application is tested and works on the following browsers:

  • Internet Explorer 11
  • Mozilla Firefox starting from 35
  • Google Chrome starting from 39
  • Opera starting from 26
  • Safari 8 and 9
  • iOS 7 for iPad and iPhone

Getting started

Download and extract

Download Ghiro as explained in this documentation; if you downloaded the stable package, extract it. Then enter the Ghiro folder.

Preparing

If you don’t already have it, install MongoDB with the following command (run as root or with sudo):

apt-get install mongodb

Ghiro works with SQLite, although using MySQL or PostgreSQL as the database is strongly suggested. If SQLite is used, Ghiro automatically reduces processing parallelism to one, because SQLite does not support concurrent operations. Optionally, as an example, you can install MySQL with the following command (run as root or with sudo):

apt-get install mysql-server

Install required libraries with the following commands (run as root or with sudo):

apt-get install python-pip build-essential python-dev python-gi
apt-get install libgexiv2-2 gir1.2-gexiv2-0.10 wkhtmltopdf
apt-get install libtiff5-dev libjpeg-dev zlib1g-dev libfreetype6-dev
apt-get install liblcms2-dev libwebp-dev tcl8.5-dev tk8.5-dev python-tk

The wkhtmltopdf tool used for PDF report generation needs a running X server. If you don’t have one, install Xvfb and configure wkhtmltopdf to use it with:

apt-get install xvfb
printf '#!/bin/bash\nxvfb-run --server-args="-screen 0, 1024x768x24" /usr/bin/wkhtmltopdf $*' > /usr/bin/wkhtmltopdf.sh
chmod a+x /usr/bin/wkhtmltopdf.sh
ln -s /usr/bin/wkhtmltopdf.sh /usr/local/bin/wkhtmltopdf
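The wrapper above only takes effect if /usr/local/bin comes before /usr/bin in the search path, so that the symlink is found first. The following minimal sketch shows one way to check which wkhtmltopdf would be picked up; find_on_path is a hypothetical helper written for this example, not something shipped with Ghiro or wkhtmltopdf.

```python
import os


def find_on_path(name, path=None):
    """Return the first executable called name found in the search path."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # If the symlink is in effect, this should be the /usr/local/bin path.
    print(find_on_path("wkhtmltopdf"))
```

If the result is /usr/bin/wkhtmltopdf instead, the wrapper is being bypassed and the search path of the user running Ghiro may need adjusting.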

Install updated libraries via pip with the following command (run as root or with sudo):

pip install -r requirements.txt

Preparing

The default databases are SQLite3 and MongoDB (MongoDB must be listening on localhost). If you need to change this, see the configuration chapter below.
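Before going further, it can help to confirm that MongoDB is actually accepting connections on localhost. The sketch below only tests whether a TCP connection can be opened; it assumes MongoDB is on its default port 27017 and does not verify that the service answering is really MongoDB. The helper name is_listening is made up for this example.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error:
        return False
    sock.close()
    return True


if __name__ == "__main__":
    # 27017 is MongoDB's default port; adjust it if your instance differs.
    print("MongoDB reachable: %s" % is_listening("127.0.0.1", 27017))
```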

First of all you need to create an empty database with the following command (inside Ghiro’s root):

python manage.py migrate

Create a superuser for administration; you will be asked for a username and a password. Use the following command (inside Ghiro’s root):

python manage.py createsuperuser

Running

To start the web interface run the following command (inside Ghiro’s root):

python manage.py runserver

A web server running Ghiro will be available at http://127.0.0.1:8000/. If you need to expose Ghiro on all addresses or change the port (9000 in this example), run the following command (inside Ghiro’s root):

python manage.py runserver 0.0.0.0:9000

To start processing images you have to start the processing daemon. Run the following command (inside Ghiro’s root):

python manage.py process

Configuration

Ghiro works well with the default options: SQLite3 as the relational database and MongoDB installed and listening on localhost. If you want to change any setting, the configuration file is located in ghiro/local_settings.py. The default settings fit most common needs.

The default ghiro/local_settings.py file is the following:

LOCAL_SETTINGS = True
from .settings import *

DATABASES = {
    'default': {
        # Engine type. Ends with 'postgresql_psycopg2', 'mysql', 'sqlite3' or 'oracle'.
        'ENGINE': 'django.db.backends.sqlite3',
        # Database name or path to database file if using sqlite3.
        'NAME': 'db.sqlite',
        # Credentials. The following settings are not used with sqlite3.
        'USER': '',
        'PASSWORD': '',
        # Empty for localhost through domain sockets or '127.0.0.1' for localhost through TCP.
        'HOST': '',
        # Set to empty string for default port.
        'PORT': '',
        # Set timeout (avoids SQLite "database is locked" errors).
        'timeout': 300,
        # The lifetime of a persistent database connection, in seconds.
        "CONN_MAX_AGE": 60,
    }
}

# MySQL tuning.
#DATABASE_OPTIONS = {
# "init_command": "SET storage_engine=INNODB",
#}

# Mongo database settings
MONGO_URI = "mongodb://localhost/"
MONGO_DB = "ghirodb"

# Max uploaded image size (in bytes).
# Default is 150MB.
MAX_FILE_UPLOAD = 157286400

# Allowed file types.
ALLOWED_EXT = ['image/bmp', 'image/x-canon-cr2', 'image/jpeg', 'image/png',
               'image/x-canon-crw', 'image/x-eps', 'image/x-nikon-nef',
               'application/postscript', 'image/gif', 'image/x-minolta-mrw',
               'image/x-olympus-orf', 'image/x-photoshop', 'image/x-fuji-raf',
               'image/x-panasonic-raw2', 'image/x-tga', 'image/tiff', 'image/pjpeg',
               'image/x-x3f', 'image/x-portable-pixmap']

# Override default secret key stored in secret_key.py
# Make this unique, and don't share it with anybody.
# SECRET_KEY = "YOUR_RANDOM_KEY"

# Language code for this installation. All choices can be found here:
# http://www.i18nguy.com/unicode/language-identifiers.html
LANGUAGE_CODE = "en-us"

ADMINS = (
    # ("Your Name", "your_email@example.com"),
)

MANAGERS = ADMINS

# Allow verbose debug error message in case of application fault.
# It's strongly suggested to set it to False if you are serving the
# web application from a web server front-end (e.g. Apache).
DEBUG = True

# A list of strings representing the host/domain names that this Django site
# can serve.
# Values in this list can be fully qualified names (e.g. 'www.example.com').
# When DEBUG is True or when running tests, host validation is disabled; any
# host will be accepted. Thus it's usually only necessary to set it in production.
ALLOWED_HOSTS = ["*"]

# Automatically checks once a day for updates.
# Set it to False to disable update check.
UPDATE_CHECK = True

# Auto upload is used to upload and analyze files from a directory, monitoring
# it for changes.
# It is usually used to upload images via a shared folder or FTP.
# It should be an absolute path.
# Example: "/home/ghiro_share"
AUTO_UPLOAD_DIR = None
# Delete a file after upload and submission.
# The default behaviour is True.
# WARNING: It is not suggested to set it to False, because you will re-submit images
# each startup.
AUTO_UPLOAD_DEL_ORIGINAL = True
# Clean up AUTO_UPLOAD_DIR when startup.
# The default behaviour is True.
# WARNING: It is not suggested to set it to False, because you will re-submit images
# each startup.
AUTO_UPLOAD_STARTUP_CLEANUP = True

# Auditing.
# Logs all user actions.
AUDITING_ENABLED = True

# Log directory. Here is where Ghiro puts all logs.
LOG_DIR = os.path.join(PROJECT_DIR, "log")
# File name used for image processor log.
LOG_PROCESSING_NAME = "processing.log"
# Processor log maximum size.
LOG_PROCESSING_SIZE = 1024*1024*16 # 16 megabytes
# How many copies of processor log keep while rotating logs.
LOG_PROCESSING_NUM = 3 # keep 3 copies
# File name used for audit log.
LOG_AUDIT_NAME = "audit.log"
# Audit log maximum size.
LOG_AUDIT_SIZE = 1024*1024*16 # 16 megabytes
# How many copies of audit log keep while rotating logs.
LOG_AUDIT_NUM = 3 # keep 3 copies

# Enable JSON export to file for analysis results.
# This will create a JSON file for each analysis.
JSON_EXPORT = False

If you change the configuration after the first setup, stop both Ghiro’s web interface and the processing daemon before editing this file; you can restart them after the edit.
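As an illustration of editing ghiro/local_settings.py, the fragment below overrides a few of the defaults shown above. It is only a sketch: the values, and in particular the host name, are hypothetical and must be adapted to your installation.

```python
# Hypothetical overrides for ghiro/local_settings.py.

# Raise the upload limit to 300 MB, expressed in bytes. The 150 MB
# default is computed the same way: 150 * 1024 * 1024 == 157286400.
MAX_FILE_UPLOAD = 300 * 1024 * 1024

# Watch a shared folder for new images (must be an absolute path).
AUTO_UPLOAD_DIR = "/home/ghiro_share"

# Disable verbose error pages and restrict the accepted host names,
# as suggested when serving Ghiro from a web server front-end.
DEBUG = False
ALLOWED_HOSTS = ["ghiro.example.com"]  # hypothetical host name
```

Writing sizes as a product of factors rather than a raw byte count makes the intended limit easier to read and to change later.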

If you changed any setting related to the database configuration, you have to rebuild your database with the same command used during the initial setup (inside Ghiro’s root):

python manage.py migrate

Logging

Ghiro provides two types of logging:

  • Audit logging: tracks all user actions for audit purposes.
  • Processing logging: logs every step of image analysis, which helps with debugging.

You can change the logging behavior by editing ghiro/local_settings.py:

  • LOG_DIR: log directory. Here is where Ghiro puts all logs.

Audit log

The audit log records all user actions (e.g. case creation, image analysis actions) to keep track of user activity. By default it is written to audit.log in the Ghiro log directory. You can change the behavior of this log by editing ghiro/local_settings.py:

  • LOG_AUDIT_NAME: audit log file name.
  • LOG_AUDIT_SIZE: audit log maximum size.
  • LOG_AUDIT_NUM: how many copies of the audit log to keep when rotating logs.

Processing log

The processing log contains the image analysis logs. It is of great help when debugging Ghiro or trying to understand what happens under the hood. By default it is written to processing.log in the Ghiro log directory. You can change the behavior of this log by editing ghiro/local_settings.py:

  • LOG_PROCESSING_NAME: processing log file name.
  • LOG_PROCESSING_SIZE: processing log maximum size.
  • LOG_PROCESSING_NUM: how many copies of the processing log to keep when rotating logs.
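To make the meaning of the size and copy settings concrete, the sketch below wires equivalent values into Python's standard RotatingFileHandler. This is an assumption made for illustration: it is not taken from Ghiro's source, and Ghiro may set up its logs differently. The function name build_rotating_logger and the "sketch." logger prefix are hypothetical.

```python
import logging
import logging.handlers
import os


def build_rotating_logger(log_dir, file_name, max_bytes, backup_count):
    """Return a logger whose file rotates at max_bytes, keeping backup_count old copies."""
    logger = logging.getLogger("sketch." + file_name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, file_name),
        maxBytes=max_bytes,        # plays the role of LOG_PROCESSING_SIZE
        backupCount=backup_count,  # plays the role of LOG_PROCESSING_NUM
    )
    logger.addHandler(handler)
    return logger
```

With the default values this would be called as build_rotating_logger(LOG_DIR, "processing.log", 1024*1024*16, 3). Under this scheme the log directory never holds more than the active file plus three rotated copies, named processing.log.1 to processing.log.3.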

Running Ghiro as service

If you want to run Ghiro as a service, you have to replace the Django development web server and run Ghiro inside a web server (e.g. Apache).

Database

We do not suggest SQLite3 for production environments; use MySQL or PostgreSQL instead. This example shows how to configure Ghiro with MySQL.

Setup MySQL and Python drivers with the following command (run as root or with sudo):

apt-get install mysql-server python-mysqldb

Go through the wizard and set the MySQL password. Then configure Ghiro to use MySQL as explained in the configuration section.
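For reference, a MySQL version of the DATABASES setting in ghiro/local_settings.py might look like the sketch below. The database name, user, and password are placeholders chosen for this example; use the values from your own MySQL setup. The database itself has to exist in MySQL before Ghiro's tables can be created in it.

```python
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "ghiro",          # hypothetical database name
        "USER": "ghiro",          # hypothetical MySQL user
        "PASSWORD": "change-me",  # placeholder, set your own
        "HOST": "",               # empty: localhost through a socket
        "PORT": "",               # empty: default port
        # Lifetime of a persistent database connection, in seconds.
        "CONN_MAX_AGE": 60,
    }
}
```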

Apache as a front-end

Now we are going to configure Apache as a front-end for Ghiro’s Django application.

Setup Apache and mod_wsgi with the following command (run as root or with sudo):

apt-get install apache2 libapache2-mod-wsgi

An example of virtual host configuration is the following (Ghiro is extracted in /var/www/ghiro/ in this example):

<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    WSGIProcessGroup ghiro
    WSGIDaemonProcess ghiro processes=5 threads=10 user=nobody group=nogroup python-path=/var/www/ghiro/ home=/var/www/ghiro/ display-name=local
    WSGIScriptAlias / /var/www/ghiro/ghiro/wsgi.py
    Alias /static/ /var/www/ghiro/static/
    <Location "/static/">
        Options -Indexes
    </Location>

    ErrorLog ${APACHE_LOG_DIR}/error.log

    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn

    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>

Restart Apache. The web application is now listening on port 80/tcp; just enter the server’s IP address in your browser.

Run the processor with upstart

You can automatically run the processor with upstart.

Create the file ghiro.conf in /etc/init/ with the following content:

description     "Ghiro"

start on started mysql
stop on shutdown
script
    exec start-stop-daemon --start -d /var/www/ghiro \
        --exec /usr/bin/python manage.py process
end script

To stop the processor use the following command (run as root or with sudo):

service ghiro stop

To start the processor use the following command (run as root or with sudo):

service ghiro start