goaccess(1)                                User Manuals                               goaccess(1)



NAME
       goaccess - fast web log analyzer and interactive viewer.

SYNOPSIS
       goaccess [-f input-file][-c][-r][-d][-m][-q][-o][-h][...]

DESCRIPTION
       goaccess is a free (GPL) real-time web log analyzer and interactive viewer that runs in a
       terminal on *nix systems. It provides fast and valuable HTTP statistics for system
       administrators who require a visual server report on the fly. GoAccess parses the specified
       web log file and outputs the data to the X terminal. Features include:


       General Statistics:
              Number of valid requests, number of invalid requests, time to analyze the data,
              unique visitors, unique requested files, unique static files (css, ico, jpg, js,
              swf, gif, png), unique HTTP referrers (URLs), unique 404s (not found), size of the
              parsed log file, bandwidth consumption.

       Unique visitors:
              HTTP  requests  having  the  same IP, same date and same agent will be considered a
              unique visit. This includes crawlers.

       Requested files
              Hit totals are based on total requests. This module will display hits, percent,
              bandwidth, [time served], [protocol] and [method].

       Requested static files
              Hit  totals are based on total requests. Includes files such as: jpg, css, swf, js,
              gif, png etc. This module will display hits,  percent,  bandwidth,  [time  served],
              [protocol] and [method].

       404 or Not Found
              Hit  totals  are  based  on total requests. This module will display hits, percent,
              bandwidth, [time served], [protocol] and [method].

       Hosts  Hit totals are based on total requests. This module  will  display  hits,  percent,
              [bandwidth, time served]. The expanded module can display extra information such as
              reverse DNS and country. If -a is enabled, a list of user agents will be  displayed
              by selecting the IP and hitting the return key.

       Operating Systems
              Hit totals are based on unique visitors. This module will display hits and percent.
              The expanded module shows all available versions of the parent node.

       Browsers
              Hit totals are based on unique visitors. This module will display hits and percent.
              The expanded module shows all available versions of the parent node.

       Referrer URLs
              The  URL where the request came from. Hit totals are based on total requests.  This
              module will display hits and percent.

       Referring Sites
              The host of the URL where the request came from. This module will display only
              the host, not the whole URL. Hit totals are based on total requests. This module
              will display hits and percent.

       Keyphrases
              This module will report keyphrases used on Google search, Google cache, and  Google
              translate.  Hit  totals  are based on total requests. This module will display hits
              and percent.

       Geo Location
              Determines where an IP address is geographically located. It outputs the  continent
              and  country.  If  it's unable to determine the country, location will be marked as
              unknown.

       HTTP Status Codes
              The numeric status codes returned for HTTP requests. Hit totals are based on
              total requests. This module will display hits and percent.
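
       The unique-visitor rule described above (same IP, same date, same agent) can be
       approximated outside GoAccess. The sketch below is not GoAccess's own implementation. It
       assumes Combined Log Format input, and count_unique_visitors is a hypothetical helper
       name.

```shell
# Count distinct (IP, date, user agent) triples in a Combined Log Format
# stream read from stdin. Illustrative only; not how GoAccess itself counts.
count_unique_visitors() {
  awk -F'"' '
    {
      # $1 holds everything before the quoted request line:
      #   IP ident user [dd/Mon/yyyy:hh:mm:ss zone]
      split($1, head, " ")
      ip    = head[1]
      day   = substr(head[4], 2, 11)   # drop the leading "[" and keep dd/Mon/yyyy
      agent = $6                       # third quoted field is the user agent
      key   = ip SUBSEP day SUBSEP agent
      if (!(key in seen)) { seen[key] = 1; n++ }
    }
    END { print n + 0 }'
}

# Usage: count_unique_visitors < access.log
```

       Because crawlers are counted too, the figure may be higher than the number of human
       visitors.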

STORAGE
       There  are  three storage options that can be used with GoAccess. Choosing one will depend
       on your environment and needs.

       GLib Hash Tables
              By default GoAccess uses GLib Hash Tables. If your dataset can fit in memory,  then
              this  will  perform  fine. It has average memory usage and pretty good performance.
              For better performance with memory trade-off see Tokyo Cabinet on-memory hash data‐
              base.

       Tokyo Cabinet On-Disk B+ Tree
              Use this storage method for large datasets where it is not possible to fit everything
              in memory. The B+ tree database is slower than any of the hash databases  since  it
              has  to  hit the disk. However, using an SSD greatly increases the performance. You
              may also use this storage method if you need data persistence to quickly load  sta‐
              tistics at a later date.

       Tokyo Cabinet On-Memory Hash Database
              Although  this  may  vary  across  different systems, in general the on-memory hash
              database should perform slightly better than GLib Hash Tables.

CONFIGURATION
       Multiple options can be used to configure GoAccess. For a complete up-to-date list of con‐
       figure options, run ./configure --help

       --enable-debug
              Compile with debugging symbols and turn off compiler optimizations.

       --enable-utf8
              Compile with wide character support. Ncursesw is required.

       --enable-geoip
              Compile with GeoLocation support. MaxMind's GeoIP is required.

       --enable-tcb=<memhash|btree>
              Compile  with  Tokyo Cabinet storage support.  memhash will utilize Tokyo Cabinet's
              on-memory hash database.  btree will utilize Tokyo Cabinet's on-disk B+ Tree  data‐
              base.

       --disable-zlib
              Disable zlib compression on B+ Tree database.

       --disable-bzip
              Disable bzip2 compression on B+ Tree database.
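
       As a hedged illustration, several of the options above can be combined in one
       configure run. The selection of flags below is only an example, and the variable name
       CONFIGURE_FLAGS is an assumption made for this sketch.

```shell
# Build options taken from the list above; adjust the selection to your needs.
CONFIGURE_FLAGS="--enable-utf8 --enable-geoip --enable-tcb=btree"

# Run from the top of an unpacked GoAccess source tree.
if [ -x ./configure ]; then
  # Word splitting of $CONFIGURE_FLAGS is intentional here.
  ./configure $CONFIGURE_FLAGS && make
else
  echo "no ./configure here; would run: ./configure $CONFIGURE_FLAGS"
fi
```

       After installation, goaccess -s reports which storage method the binary was built with.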

OPTIONS
       The following options can be supplied via the command line or the long options through the
       configuration file.

       --date-format=<dateformat>
              The date_format variable, followed by a space, specifies the date format of the
              log. It may contain any combination of regular characters and special format
              specifiers, which all begin with a percent (%) sign. See `man strftime`.

              Note that there is no need to use time specifiers since they are not used by
              GoAccess. It's recommended to use only date specifiers, e.g., %Y-%m-%d.

       --log-format=<logformat>
              The log_format variable, followed by a space or \t (for tab-delimited logs),
              specifies the log format string.

              Note that if there are spaces within the format, the string needs to be enclosed in
              double quotes. Inner quotes need to be escaped.

       -c --config-dialog
              Prompt log/date configuration window on program start.

       --color-scheme=<1|2>
              Choose  among  color  schemes.   1  for  the  default grey scheme.  2 for the green
              scheme.

       --no-color
              Turn off colored output. This is the  default output on terminals that do not  sup‐
              port colors.

       -f --log-file=<logfile>
              Specify  the  path  to  the input log file. If set in the config file, it will take
              priority over -f from the command line.

       --debug-file=<debugfile>
              Send all debug messages to the specified file. GoAccess needs to be configured
              with --enable-debug.

       --config-file=<configfile>
              Specify  a custom configuration file to use. If set, it will take priority over the
              global configuration file (if any).

       --no-global-config
              Do not load the global configuration file. That file normally resides in
              /usr/local/etc, unless another directory was specified with --sysconfdir=/dir.

       -e --exclude-ip=<IP|IP-range>
              Exclude one or multiple IPv4/6 addresses, including IP ranges, e.g., 192.168.0.1-192.168.0.10

       -a --agent-list
              Enable a list of user-agents by host. For faster parsing, do not enable this flag.

       -M --http-method
              Include HTTP request method if found. This will create a request key containing the
              request method + the actual request.

       -H --http-protocol
              Include HTTP request protocol if found. This will create a request  key  containing
              the request protocol + the actual request.

       -q --no-query-string
              Ignore the request's query string, e.g., www.google.com/page.htm?query =>
              www.google.com/page.htm

       -r --no-term-resolver
              Disable IP resolver on terminal output.

       -o --output-format=<json|csv>
              Write output to stdout in one of the following formats:

              csv    Comma-separated values (CSV)

              json   JSON (JavaScript Object Notation)

       --real-os
              Display real OS names, e.g., Windows XP, Snow Leopard.

       --static-file=<extension>
              Add static file extension, e.g., .mp3. Extensions are case sensitive.

       --ignore-crawlers
              Ignore crawlers.

       --no-progress
              Disable progress metrics [total requests/requests per second].

       -m --with-mouse
              Enable mouse support on main dashboard.

       -d --with-output-resolver
              Enable IP resolver on HTML|JSON output.

       -g --std-geoip
              Standard GeoIP database for less memory usage.

       --geoip-city-data=<geocityfile>
              Specify the path to the GeoIP City database file, e.g., GeoLiteCity.dat. The file
              needs to be downloaded from maxmind.com.

       --keep-db-files
              Persist parsed data to disk. This should be set on the first dataset before using
              `load-from-disk`. Setting it to false will delete all database files when exiting
              the program.

              Only if configured with --enable-tcb=btree

       --load-from-disk
              Load previously stored data from disk. Database files need to exist.  See  keep-db-
              files.

              Only if configured with --enable-tcb=btree

       --db-path=<dir>
              Path  where  the  on-disk  database files are stored. The default value is the /tmp
              directory.

              Only if configured with --enable-tcb=btree

       --xmmap=<num>
              Set the size in bytes of the extra mapped memory. The default value is 0.

              Only if configured with --enable-tcb=btree

       --cache-lcnum=<num>
              Specifies the maximum number of leaf nodes to be cached. If it is not more than 0,
              the default value is used. The default value is 1024. A larger value increases
              speed at the cost of higher memory consumption; a lower value decreases memory
              consumption.

              Only if configured with --enable-tcb=btree

       --cache-ncnum=<num>
              Specifies the maximum number of non-leaf nodes to be cached. If it is not more than
              0, the default value is used. The default value is 512.

              Only if configured with --enable-tcb=btree

       --tune-lmemb=<num>
              Specifies the number of members in each leaf page. If it is not more than 0, the
              default value is used. The default value is 128.

              Only if configured with --enable-tcb=btree

       --tune-nmemb=<num>
              Specifies the number of members in each non-leaf page. If it is not more than 0,
              the default value is used. The default value is 256.

              Only if configured with --enable-tcb=btree

       --tune-bnum=<num>
              Specifies the number of elements of the bucket array. If it is not more than 0, the
              default value is used. The default value is 32749. The suggested size of the
              bucket array is about 1 to 4 times the number of all pages to be stored.

              Only if configured with --enable-tcb=btree

       --compression=<zlib|bz2>
              Specifies that each page is compressed with ZLIB|BZ2 encoding.

              Only if configured with --enable-tcb=btree

       -h --help
              Display help and exit.

       -V --version
              Display version information and exit.

       -s --storage
              Display the current storage method, e.g., B+ Tree, Hash.
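
       The --keep-db-files and --load-from-disk options above are meant to be used as a pair.
       A minimal sketch of that workflow follows, assuming a build configured with
       --enable-tcb=btree. The helper names parse_and_keep and reload_kept are hypothetical,
       and running the reload step without -f is an assumption that should be checked against
       the installed version.

```shell
# Hypothetical helper names; every flag used here is listed in OPTIONS.
# Both helpers assume a build configured with --enable-tcb=btree.

# First run: parse LOG and keep the on-disk database in DB_DIR on exit.
parse_and_keep() {  # usage: parse_and_keep LOG DB_DIR
  goaccess -f "$1" --keep-db-files --db-path="$2"
}

# Later run: reload the stored data from DB_DIR and keep it again on exit.
reload_kept() {     # usage: reload_kept DB_DIR
  goaccess --load-from-disk --keep-db-files --db-path="$1"
}
```

       Passing --keep-db-files on the reload step as well avoids deleting the database files
       when the program exits.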

CUSTOM LOG/DATE FORMAT
       GoAccess can parse virtually any web log format.

       Predefined options include Common Log Format (CLF) and Combined Log Format (XLF/ELF),
       each with or without a virtual host, Amazon CloudFront (Download Distribution), and W3C
       format (IIS).

       GoAccess allows any custom format string as well.

       There are two ways to configure the log format.  The easiest is to run GoAccess with -c to
       prompt a configuration window. Otherwise, it can be configured under ~/.goaccessrc.

       date_format
              The date_format variable, followed by a space, specifies the date format of the
              log. It may contain any combination of regular characters and special format
              specifiers, which all begin with a percent (%) sign. See
              http://linux.die.net/man/3/strftime

              Note that there is no need to use time specifiers since they are not used by
              GoAccess. It's recommended to use only date specifiers, e.g., %Y-%m-%d.

       log_format
              The log_format variable, followed by a space or \t, specifies the log format
              string.

       %d     date field matching the date_format variable.

       %h     host (the client IP address, either IPv4 or IPv6)

       %r     The request line from the client. This requires specific delimiters around the
              request (such as single quotes, double quotes, or anything else) to be parsable. If
              there are none, use a combination of special format specifiers such as %m %U %H.

       %m     The request method.

       %U     The URL path requested (including any query string).

       %H     The request protocol.

       %s     The status code that the server sends back to the client.

       %b     The size of the object returned to the client.

       %R     The "Referer" HTTP request header.

       %u     The user-agent HTTP request header.

       %D     The time taken to serve the request, in microseconds.

       %T     The time taken to serve the request, in seconds or  milliseconds.   Note:  %D  will
              take priority over %T if both are used.

       %^     Ignore this field.

       GoAccess requires the following fields:

              %h a valid IPv4/6

              %d a valid date

              %s server status code

              %r the request

INTERACTIVE MENU
       F1 or h
              Main help.

       F5     Redraw main window.

       q      Quit the program, current window or collapse active module

       o or  ENTER
              Expand selected module or open window

       0-9 and Shift + 0
              Set selected module to active

       j      Scroll down within expanded module

       k      Scroll up within expanded module

       c      Set or change scheme color.

       TAB    Forward iteration of modules. Starts from current active module.

       SHIFT + TAB
              Backward iteration of modules. Starts from current active module.

       ^ f    Scroll forward one screen within an active module.

       ^ b    Scroll backward one screen within an active module.

       s      Sort options for active module

       /      Search across all modules (regex allowed)

       n      Find the position of the next occurrence across all modules.

       g      Move to the first item or top of screen.

       G      Move to the last item or bottom of screen.

EXAMPLES
       The simplest and fastest usage would be:

              # goaccess -f access.log

       That will generate an interactive text-only output.

       To generate full statistics we can run GoAccess as:

              # goaccess -f access.log -a

       To generate an HTML report:

              # goaccess -f access.log -a > report.html

       To generate a JSON file:

              # goaccess -f access.log -a -d -o json > report.json

       To generate a CSV file:

              # goaccess -f access.log -o csv > report.csv

       The -a flag indicates that we want to process an agent-list for every host parsed.

       The -d flag indicates that we want to enable the IP resolver on the HTML | JSON output.
       (The output will take longer to produce since every IP has to be resolved.)

       The -c flag will prompt the date and log format configuration window. This applies only
       when curses is initialized.

       Now  if  we  want  to add more flexibility to GoAccess, we can do a series of  pipes.  For
       instance:

       If we would like to process all access.log.*.gz we can do:

              #  zcat access.log.*.gz | goaccess

       OR

              #  zcat -f access.log* | goaccess

       Another useful pipe is filtering the web log by date.

       The following will get all HTTP requests starting on 05/Dec/2010  until  the  end  of  the
       file.

              # sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a

       If we want to parse only a certain time-frame from DATE a to DATE b, we can do:

              # sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a

       Note that this could take longer to parse, depending on the speed of sed.
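
       A sed address range stops at the first line that matches its end pattern, so the range
       above includes only the first request logged on the end date, and an unanchored
       5\/Nov also matches 15/Nov and 25/Nov. As a hedged alternative, the sketch below
       compares dates numerically with awk. It assumes Common or Combined Log Format (the
       timestamp in the fourth whitespace-separated field); filter_dates is a hypothetical
       helper name, not part of GoAccess.

```shell
# Print log lines whose date lies between FIRST and LAST, both inclusive.
# Dates are given as YYYYMMDD; the log is read from stdin.
filter_dates() {  # usage: filter_dates FIRST LAST < access.log
  awk -v first="$1" -v last="$2" '
    BEGIN {
      split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
      for (i = 1; i <= 12; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
      # $4 looks like [05/Nov/2010:00:00:01 in CLF and Combined logs
      split(substr($4, 2), t, "[/:]")
      key = (t[3] month[t[2]] t[1]) + 0   # e.g. 20101105
      if (key >= first + 0 && key <= last + 0) print
    }'
}

# Usage: filter_dates 20101105 20101205 < access.log | goaccess -a
```

       Unlike the sed range, this keeps every request logged on the last day and does not
       depend on the log being in chronological order.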

       To exclude a list of virtual hosts you can do the following:

              # grep -v "`cat exclude_vhost_list_file`" vhost_access.log | goaccess
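
       The backquoted cat above hands grep every line of the list as a basic regular
       expression, so a dot matches any character and a blank line inside the list matches
       every log line. A minimal sketch of a stricter variant follows, assuming one virtual
       host name per line in the list file; exclude_vhosts is a hypothetical helper name.

```shell
# Drop log lines (read from stdin) that contain any host name listed in
# LIST_FILE. Names are matched as fixed strings, not regular expressions.
exclude_vhosts() {  # usage: exclude_vhosts LIST_FILE < vhost_access.log
  patterns=$(mktemp) || return 1
  # Remove blank lines: an empty pattern would match every log line.
  sed '/^[[:space:]]*$/d' "$1" > "$patterns"
  grep -v -F -f "$patterns"
  status=$?
  rm -f "$patterns"
  return "$status"
}

# Usage: exclude_vhosts exclude_vhost_list_file < vhost_access.log | goaccess
```

       Note that -F still matches a name anywhere in the line, for example inside a referrer
       URL, so the list should contain names that are specific enough.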

       Also,  it  is worth pointing out that if we want to run GoAccess at lower priority, we can
       run it as:

              # nice -n 19 goaccess -f access.log -a

       and if you don't want to install it on your server, you can still run it from  your  local
       machine:

              # ssh root@server 'cat /var/log/apache2/access.log' | goaccess -a

NOTES
       For  now,  each active window has a total of 300 items. Eventually this will be customiza‐
       ble.

       Piping a log to GoAccess will disable the real-time functionality.  This  is  due  to  the
       portability  issue  on  determining  the  actual  size of STDIN. However, a future release
       *might* include this feature.

BUGS
       If you think you have found a bug, please send an email to goaccess AT prosoftcorp.com or
       use the issue tracker at https://github.com/allinurl/goaccess/issues

AUTHOR
       Gerardo  Orellana  <goaccess AT prosoftcorp.com>  For more details about it, or new releases,
       please visit http://goaccess.prosoftcorp.com



Linux                                       JULY 2014                                 goaccess(1)

