Setup Ghiro¶
Ghiro is designed to run natively on GNU/Linux. The command examples in this documentation use the latest Ubuntu LTS Server as the reference system, although Ghiro works on any GNU/Linux distribution. Ghiro may also work on other systems such as Mac OS X, but this is untested and out of the scope of this documentation.
Requirements¶
Ghiro has the following requirements:
- MongoDB: you need to run a MongoDB database (at least release 2.0)
- Python: that’s how we roll (Python 2.7 is required; Ghiro itself is written to also work with Python 3, but some third-party libraries are not)
- Python-magic: for MIME extraction
- Python 2.x bindings for gobject-introspection libraries, required by Gexiv2
- Gexiv2: for metadata extraction (at least release 0.6.1)
- Pillow (Python Imaging library - PIL fork): for image manipulation
- Python-dateutil: for datetime manipulation
- Pymongo: driver for MongoDB (at least release 2.5)
- Django: for web interface (at least release 1.5, Django 1.6.x suggested)
- Chardet: for text encoding detection
- Pdfkit: used for PDF report generation (at least release 0.4)
- Wkhtmltopdf: used by pdfkit
- Requests: used for HTTP requests
- NudePy: used for nude detection
- ImageHash: to calculate perceptual image hash
If you choose MySQL or PostgreSQL as the database, you also have to install the corresponding drivers.
The Ghiro web application is tested and works on the following browsers:
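Before continuing, it can be handy to check which of the Python libraries above are already importable. The following is a minimal sketch and not part of Ghiro: the helper name `missing_modules` is made up for illustration, and the import names listed (for example `magic` for Python-magic, `PIL` for Pillow, `nude` for NudePy) are the names these packages are usually imported under.

```python
import importlib

# Import names commonly used by the libraries listed above (an assumption of
# this sketch; Ghiro itself may check its dependencies differently).
REQUIRED_MODULES = [
    "magic", "gi", "PIL", "dateutil", "pymongo", "django",
    "chardet", "pdfkit", "requests", "nude", "imagehash",
]


def missing_modules(names):
    """Return the subset of names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when every library can be imported, and otherwise the names that still need to be installed.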
- Internet Explorer 11
- Mozilla Firefox starting from 35
- Google Chrome starting from 39
- Opera starting from 26
- Safari 8 and 9
- iOS 7 for iPad and iPhone
Getting started¶
Download and extract¶
Download Ghiro as explained in this documentation; if you downloaded the stable package, extract it. Then enter the Ghiro folder.
Preparing¶
If you don’t already have it, install MongoDB with the following command (run as root or with sudo):
apt-get install mongodb
Ghiro works with SQLite, although it is strongly suggested to use MySQL or PostgreSQL as the database. If SQLite is used, Ghiro automatically reduces processing parallelism to one, because SQLite does not support concurrent operations. Optionally, as an example, you can install MySQL with the following command (run as root or with sudo):
apt-get install mysql-server
Install required libraries with the following commands (run as root or with sudo):
apt-get install python-pip build-essential python-dev python-gi
apt-get install libgexiv2-2 gir1.2-gexiv2-0.10 wkhtmltopdf
apt-get install libtiff5-dev libjpeg-dev zlib1g-dev libfreetype6-dev
apt-get install liblcms2-dev libwebp-dev tcl8.5-dev tk8.5-dev python-tk
The wkhtmltopdf tool, used for PDF report generation, needs a running X server. If you don’t have one, install Xvfb and configure wkhtmltopdf to use it with:
apt-get install xvfb
printf '#!/bin/bash\nxvfb-run --server-args="-screen 0, 1024x768x24" /usr/bin/wkhtmltopdf $*' > /usr/bin/wkhtmltopdf.sh
chmod a+x /usr/bin/wkhtmltopdf.sh
ln -s /usr/bin/wkhtmltopdf.sh /usr/local/bin/wkhtmltopdf
Install updated libraries via pip with the following commands (run as root or with sudo):
pip install -r requirements.txt
Preparing¶
The default databases are SQLite3 and MongoDB (MongoDB must be listening on localhost). If you need to change this, see the configuration chapter below.
First of all you need to create an empty database with the following command (inside Ghiro’s root):
python manage.py migrate
Create a superuser for administration; you will be asked for a username and a password. Use the following command (inside Ghiro’s root):
python manage.py createsuperuser
Running¶
To start the web interface run the following command (inside Ghiro’s root):
python manage.py runserver
A web server running Ghiro will be available at http://127.0.0.1:8000/. If you need to expose Ghiro on all addresses or change the port (9000 in this example), run the following command (inside Ghiro’s root):
python manage.py runserver 0.0.0.0:9000
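To confirm that the web interface is actually reachable after starting it, a quick TCP probe is enough. This is a small sketch using only the Python standard library; `is_listening` is a hypothetical helper name and not part of Ghiro.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host or timeout.
        return False
    sock.close()
    return True
```

For instance, `is_listening("127.0.0.1", 8000)` should return True while the default `runserver` is up. The same probe can be pointed at MongoDB, which listens on port 27017 by default.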
To start processing images you have to start the processing daemon. Run the following command (inside Ghiro’s root):
python manage.py process
Configuration¶
Ghiro works well with the default options: SQLite3 as the relational database, and MongoDB installed and listening on localhost. If you want to change any setting, the configuration file is located in ghiro/local_settings.py. The default settings fit most common needs.
Following is the default ghiro/local_settings.py file:
LOCAL_SETTINGS = True
from .settings import *
DATABASES = {
    'default': {
        # Engine type. Ends with 'postgresql_psycopg2', 'mysql', 'sqlite3' or 'oracle'.
        'ENGINE': 'django.db.backends.sqlite3',
        # Database name or path to database file if using sqlite3.
        'NAME': 'db.sqlite',
        # Credentials. The following settings are not used with sqlite3.
        'USER': '',
        'PASSWORD': '',
        # Empty for localhost through domain sockets or '127.0.0.1' for localhost through TCP.
        'HOST': '',
        # Set to empty string for default port.
        'PORT': '',
        # Set timeout (avoids SQLite "database is locked" errors).
        'timeout': 300,
        # The lifetime of a database persistent connection, in seconds.
        "CONN_MAX_AGE": 60,
    }
}
# MySQL tuning.
#DATABASE_OPTIONS = {
# "init_command": "SET storage_engine=INNODB",
#}
# Mongo database settings
MONGO_URI = "mongodb://localhost/"
MONGO_DB = "ghirodb"
# Max uploaded image size (in bytes).
# Default is 150MB.
MAX_FILE_UPLOAD = 157286400
# Allowed file types.
ALLOWED_EXT = ['image/bmp', 'image/x-canon-cr2', 'image/jpeg', 'image/png',
               'image/x-canon-crw', 'image/x-eps', 'image/x-nikon-nef',
               'application/postscript', 'image/gif', 'image/x-minolta-mrw',
               'image/x-olympus-orf', 'image/x-photoshop', 'image/x-fuji-raf',
               'image/x-panasonic-raw2', 'image/x-tga', 'image/tiff', 'image/pjpeg',
               'image/x-x3f', 'image/x-portable-pixmap']
# Override default secret key stored in secret_key.py
# Make this unique, and don't share it with anybody.
# SECRET_KEY = "YOUR_RANDOM_KEY"
# Language code for this installation. All choices can be found here:
# http://www.i18nguy.com/unicode/language-identifiers.html
LANGUAGE_CODE = "en-us"
ADMINS = (
# ("Your Name", "your_email@example.com"),
)
MANAGERS = ADMINS
# Allow verbose debug error message in case of application fault.
# It's strongly suggested to set it to False if you are serving the
# web application from a web server front-end (i.e. Apache).
DEBUG = True
# A list of strings representing the host/domain names that this Django site
# can serve.
# Values in this list can be fully qualified names (e.g. 'www.example.com').
# When DEBUG is True or when running tests, host validation is disabled; any
# host will be accepted. Thus it's usually only necessary to set it in production.
ALLOWED_HOSTS = ["*"]
# Automatically checks once a day for updates.
# Set it to False to disable update check.
UPDATE_CHECK = True
# Auto upload is used to upload and analyze files from a directory, monitoring
# it for changes.
# It is usually used to upload images via a shared folder or FTP.
# It should be an absolute path.
# Example: "/home/ghiro_share"
AUTO_UPLOAD_DIR = None
# Delete a file after upload and submission.
# The default behaviour is True.
# WARNING: It is not suggested to set it to False, because you will re-submit images
# each startup.
AUTO_UPLOAD_DEL_ORIGINAL = True
# Clean up AUTO_UPLOAD_DIR at startup.
# The default behaviour is True.
# WARNING: It is not suggested to set it to False, because you will re-submit images
# each startup.
AUTO_UPLOAD_STARTUP_CLEANUP = True
# Auditing.
# Logs all user actions.
AUDITING_ENABLED = True
# Log directory. Here is where Ghiro puts all logs.
LOG_DIR = os.path.join(PROJECT_DIR, "log")
# File name used for image processor log.
LOG_PROCESSING_NAME = "processing.log"
# Processor log maximum size.
LOG_PROCESSING_SIZE = 1024*1024*16 # 16 megabytes
# How many copies of the processor log to keep while rotating logs.
LOG_PROCESSING_NUM = 3 # keep 3 copies
# File name used for audit log.
LOG_AUDIT_NAME = "audit.log"
# Audit log maximum size.
LOG_AUDIT_SIZE = 1024*1024*16 # 16 megabytes
# How many copies of the audit log to keep while rotating logs.
LOG_AUDIT_NUM = 3 # keep 3 copies
# Enable JSON export to file for analysis results.
# This will create a JSON file for each analysis.
JSON_EXPORT = False
If you change the configuration after the first setup, you have to stop both Ghiro’s web interface and the processing daemon before editing this file; you may restart them after the edit.
If you changed any setting related to the database configuration, you have to rebuild your database with the following command (inside Ghiro’s root):
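As an illustration, the lines below sketch a few settings one might change in ghiro/local_settings.py. The values are examples only: the 50 MB limit and the host name `ghiro.example.com` are assumptions made up for this sketch, while `/home/ghiro_share` is the example path given in the comments of the default file.

```python
# Hypothetical overrides for ghiro/local_settings.py (example values only).

# Lower the upload limit from the default 150 MB to 50 MB (value in bytes).
MAX_FILE_UPLOAD = 50 * 1024 * 1024

# Watch a shared folder for new images; this must be an absolute path.
AUTO_UPLOAD_DIR = "/home/ghiro_share"

# Disable verbose error pages and restrict accepted host names when serving
# Ghiro from a front-end web server.
DEBUG = False
ALLOWED_HOSTS = ["ghiro.example.com"]  # placeholder host name
```

Size settings are plain byte counts, so the default `157286400` is simply `150 * 1024 * 1024`.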
python manage.py syncdb
Logging¶
Ghiro’s logging is divided into the following categories:
- Audit logging: tracks all user actions for audit purposes.
- Processing logging: logs all steps of image analysis, which helps with debugging.
You can change the logging behavior by editing ghiro/local_settings.py:
- LOG_DIR: log directory. Here is where Ghiro puts all logs.
Audit log¶
The audit log contains all user actions (e.g. case creation, image analysis actions) to keep track of user activity. By default it is located in audit.log in the Ghiro log directory. You can change the behavior of this log by editing ghiro/local_settings.py:
- LOG_AUDIT_NAME: audit log file name.
- LOG_AUDIT_SIZE: audit log maximum size.
- LOG_AUDIT_NUM: how many copies of the audit log to keep while rotating logs.
Processing log¶
The processing log contains image analysis logs. It is of great help when debugging Ghiro or trying to understand what happens under the hood. By default it is located in processing.log in the Ghiro log directory. You can change the behavior of this log by editing ghiro/local_settings.py:
- LOG_PROCESSING_NAME: processing log file name.
- LOG_PROCESSING_SIZE: processing log maximum size.
- LOG_PROCESSING_NUM: how many copies of the processing log to keep while rotating logs.
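The size and copy-count settings follow the usual log rotation pattern: once a log file reaches the maximum size it is rolled over, and only a fixed number of old copies is kept. The sketch below shows that pattern with `logging.handlers.RotatingFileHandler` from the Python standard library. It is an assumption that Ghiro wires its settings this way, and the function name `build_rotating_handler` is invented for the example.

```python
import logging
import logging.handlers
import os


def build_rotating_handler(log_dir, name, max_size, copies):
    """Create a handler that rotates log_dir/name at max_size bytes,
    keeping `copies` rotated files next to the active one."""
    if not os.path.isdir(log_dir):
        os.makedirs(log_dir)
    return logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, name),
        maxBytes=max_size,
        backupCount=copies,
    )
```

With the default values (`1024*1024*16` bytes and 3 copies), a handler built like this would keep processing.log plus the rotated files processing.log.1 to processing.log.3.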
Running Ghiro as service¶
If you want to run Ghiro as a service, you have to drop the Django development web server and run Ghiro inside a web server (e.g. Apache).
Database¶
We do not suggest SQLite3 for production environments; use MySQL or PostgreSQL instead. This example shows how to configure Ghiro with MySQL.
Set up MySQL and the Python drivers with the following command (run as root or with sudo):
apt-get install mysql-server python-mysqldb
Go through the wizard and set the MySQL password. Then configure Ghiro to use MySQL as explained in the Configuration section.
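A MySQL entry in ghiro/local_settings.py could look roughly like the following sketch. The database name, user and password are placeholders chosen for this example, not Ghiro defaults, and the database and user are assumed to already exist in MySQL.

```python
# Sketch of a MySQL database entry for ghiro/local_settings.py.
# "ghiro" (name and user) and "change-me" are placeholder values.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "ghiro",
        "USER": "ghiro",
        "PASSWORD": "change-me",
        # Empty host and port mean localhost on the default port.
        "HOST": "",
        "PORT": "",
        # The lifetime of a persistent connection, in seconds.
        "CONN_MAX_AGE": 60,
    }
}
```

After switching the engine, the database has to be rebuilt as described in the Configuration section.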
Apache as a front-end¶
Now we are going to configure Apache as a front end for Ghiro’s Django application.
Set up Apache and mod_wsgi with the following command (run as root or with sudo):
apt-get install apache2 libapache2-mod-wsgi
An example of virtual host configuration is the following (Ghiro is extracted in /var/www/ghiro/ in this example):
<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    WSGIProcessGroup ghiro
    WSGIDaemonProcess ghiro processes=5 threads=10 user=nobody group=nogroup python-path=/var/www/ghiro/ home=/var/www/ghiro/ display-name=local
    WSGIScriptAlias / /var/www/ghiro/ghiro/wsgi.py
    Alias /static/ /var/www/ghiro/static/
    <Location "/static/">
        Options -Indexes
    </Location>
    ErrorLog ${APACHE_LOG_DIR}/error.log
    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn
    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>
Restart Apache. The web application is now listening on port 80/tcp; just enter the server’s IP address in your browser.
Run the processor with upstart¶
You can automatically run the processor with upstart.
Create the file ghiro.conf in /etc/init/ with the following content:
description "Ghiro"
start on started mysql
stop on shutdown
script
    exec start-stop-daemon --start -d /var/www/ghiro \
        --exec /usr/bin/python manage.py process
end script
To stop the processor use the following command (run as root or with sudo):
service ghiro stop
To start the processor use the following command (run as root or with sudo):
service ghiro start