plaso.engine package

Submodules

plaso.engine.artifact_filters module

Helper to create filters based on forensic artifact definitions.

class plaso.engine.artifact_filters.ArtifactDefinitionsFiltersHelper(artifacts_registry)[source]

Bases: object

Helper to create collection filters based on artifact definitions.

Builds collection filters from forensic artifact definitions.

For more information about Forensic Artifacts see: https://github.com/ForensicArtifacts/artifacts/blob/main/docs/Artifacts%20definition%20format%20and%20style%20guide.asciidoc

file_system_artifact_names

names of artifact definitions that generated file system find specifications.

Type:

set[str]

file_system_find_specs

file system find specifications of paths to include in the collection.

Type:

list[dfvfs.FindSpec]

registry_artifact_names

names of artifact definitions that generated Windows Registry find specifications.

Type:

set[str]

registry_find_specs

Windows Registry find specifications.

Type:

list[dfwinreg.FindSpec]

BuildFindSpecs(artifact_filter_names, environment_variables=None, user_accounts=None)[source]

Builds find specifications from artifact definitions.

Parameters:
  • artifact_filter_names (list[str]) – names of artifact definitions that are used for filtering file system and Windows Registry key paths.

  • environment_variables (Optional[list[EnvironmentVariableArtifact]]) – environment variables.

  • user_accounts (Optional[list[UserAccountArtifact]]) – user accounts.

classmethod CheckKeyCompatibility(key_path)[source]

Checks if a Windows Registry key path is supported by dfWinReg.

Parameters:

key_path (str) – path of the Windows Registry key.

Returns:

True if key is compatible or False if not.

Return type:

bool
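
As a rough illustration of what such a compatibility check can look like, the sketch below tests a key path against a list of supported prefixes, ignoring case. The function name and the prefix list are assumptions made for this example; they are not taken from plaso or dfWinReg.

```python
# Illustrative prefixes only (assumption); the key paths actually supported
# by dfWinReg may differ.
SUPPORTED_KEY_PATH_PREFIXES = (
    "HKEY_CURRENT_USER",
    "HKEY_LOCAL_MACHINE\\SOFTWARE",
    "HKEY_LOCAL_MACHINE\\SYSTEM",
)


def check_key_compatibility(key_path):
    """Returns True if key_path starts with one of the supported prefixes."""
    # Compare in upper case so that the check is case insensitive.
    key_path_upper = key_path.upper()
    return any(
        key_path_upper.startswith(prefix)
        for prefix in SUPPORTED_KEY_PATH_PREFIXES)
```

A caller would typically skip, and possibly log, key paths for which the check returns False.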

__init__(artifacts_registry)[source]

Initializes an artifact definitions filters helper.

Parameters:

artifacts_registry (artifacts.ArtifactDefinitionsRegistry) – artifact definitions registry.

plaso.engine.configurations module

Processing configuration classes.

class plaso.engine.configurations.CredentialConfiguration(*args: Any, **kwargs: Any)[source]

Bases: AttributeContainer

Configuration settings for a credential.

credential_data

credential data.

Type:

bytes

credential_type

credential type.

Type:

str

path_spec

path specification.

Type:

dfvfs.PathSpec

CONTAINER_TYPE = 'credential_configuration'
__init__(credential_data=None, credential_type=None, path_spec=None)[source]

Initializes a credential configuration object.

Parameters:
  • credential_data (Optional[bytes]) – credential data.

  • credential_type (Optional[str]) – credential type.

  • path_spec (Optional[dfvfs.PathSpec]) – path specification.

class plaso.engine.configurations.EventExtractionConfiguration(*args: Any, **kwargs: Any)[source]

Bases: AttributeContainer

Configuration settings for event extraction.

These settings are primarily used by the parser mediator.

filter_object

filter that specifies which events to include.

Type:

objectfilter.Filter

CONTAINER_TYPE = 'event_extraction_configuration'
__init__()[source]

Initializes an event extraction configuration object.

class plaso.engine.configurations.ExtractionConfiguration(*args: Any, **kwargs: Any)[source]

Bases: AttributeContainer

Configuration settings for extraction.

These settings are primarily used by the extraction worker.

archive_types_string

comma separated archive types for which embedded file entries should be processed.

Type:

str

extract_winevt_resources

True if Windows EventLog resources should be extracted.

Type:

bool

extract_winreg_binary

True if Windows Registry binary values should be extracted.

Type:

bool

hasher_file_size_limit

maximum file size that hashers should process, where 0 or None represents unlimited.

Type:

int

hasher_names_string

comma separated names of hashers to use during processing.

Type:

str

process_compressed_streams

True if file content in compressed streams should be processed.

Type:

bool

yara_rules_string

Yara rule definitions.

Type:

str

CONTAINER_TYPE = 'extraction_configuration'
__init__()[source]

Initializes an extraction configuration object.
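
To make the string-valued settings above more concrete, the following sketch shows one way a consumer could interpret a comma separated names string and a size limit where 0 or None means unlimited. Both helper functions are hypothetical and not part of plaso; treating the limit as inclusive is also an assumption.

```python
def parse_names(names_string):
    """Splits a comma separated string into a list of stripped names."""
    if not names_string:
        return []
    return [name.strip() for name in names_string.split(",") if name.strip()]


def within_size_limit(file_size, size_limit):
    """Returns True if a file of file_size bytes falls within the limit."""
    # A limit of 0 or None represents unlimited.
    if not size_limit:
        return True
    # Assumption: a file exactly at the limit is still processed.
    return file_size <= size_limit
```

For example, parse_names("md5, sha256") yields the two hasher names without surrounding whitespace.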

class plaso.engine.configurations.ProcessingConfiguration(*args: Any, **kwargs: Any)[source]

Bases: AttributeContainer

Configuration settings for processing.

artifact_definitions_path

path to artifact definitions directory or file.

Type:

str

artifact_filters

names of artifact definitions that are used for filtering file system and Windows Registry key paths.

Type:

Optional[list[str]]

credentials

credential configurations.

Type:

list[CredentialConfiguration]

custom_artifacts_path

path to custom artifact definitions directory or file.

Type:

str

custom_formatters_path

path to custom formatter definitions file.

Type:

str

data_location

path to the data files.

Type:

str

debug_output

True if debug output should be enabled.

Type:

bool

dynamic_time

True if date and time values should be represented in their granularity or semantically.

Type:

bool

event_extraction

event extraction configuration.

Type:

EventExtractionConfiguration

extraction

extraction configuration.

Type:

ExtractionConfiguration

filter_file

path to a file with find specifications.

Type:

str

force_parser

True if a specified parser should be forced to be used to extract events.

Type:

bool

log_filename

name of the log file.

Type:

str

parser_filter_expression

parser filter expression, where None represents all parsers and plugins.

Type:

str

preferred_codepage

preferred codepage.

Type:

str

preferred_encoding

preferred output encoding.

Type:

str

preferred_language

preferred language.

Type:

str

preferred_time_zone

preferred time zone.

Type:

str

preferred_year

preferred initial year value for year-less date and time values.

Type:

int

profiling

profiling configuration.

Type:

ProfilingConfiguration

task_storage_format

format to use for storing task results.

Type:

str

task_storage_path

path of the directory containing SQLite task storage files.

Type:

str

temporary_directory

path of the directory for temporary files.

Type:

str

CONTAINER_TYPE = 'processing_configuration'
__init__()[source]

Initializes a processing configuration object.

class plaso.engine.configurations.ProfilingConfiguration(*args: Any, **kwargs: Any)[source]

Bases: AttributeContainer

Configuration settings for profiling.

directory

path to the directory where the profiling sample files should be stored.

Type:

str

profilers

names of the profilers to enable. Supported profilers are:

  • ‘format_checks’, which profiles CPU time consumed per format check;

  • ‘memory’, which profiles memory usage;

  • ‘parsers’, which profiles CPU time consumed by individual parsers;

  • ‘processing’, which profiles CPU time consumed by different parts of processing;

  • ‘serializers’, which profiles CPU time consumed by individual serializers;

  • ‘storage’, which profiles storage reads and writes.

Type:

set[str]

sample_rate

the profiling sample rate. Contains the number of event sources processed.

Type:

int

CONTAINER_TYPE = 'profiling_configuration'
HaveProfileAnalyzers()[source]

Determines if analyzers profiling is configured.

Returns:

True if analyzers profiling is configured.

Return type:

bool

HaveProfileFormatChecks()[source]

Determines if format checks profiling is configured.

Returns:

True if format checks profiling is configured.

Return type:

bool

HaveProfileMemory()[source]

Determines if memory profiling is configured.

Returns:

True if memory profiling is configured.

Return type:

bool

HaveProfileParsers()[source]

Determines if parsers profiling is configured.

Returns:

True if parsers profiling is configured.

Return type:

bool

HaveProfileProcessing()[source]

Determines if processing profiling is configured.

Returns:

True if processing profiling is configured.

Return type:

bool

HaveProfileSerializers()[source]

Determines if serializers profiling is configured.

Returns:

True if serializers profiling is configured.

Return type:

bool

HaveProfileStorage()[source]

Determines if storage profiling is configured.

Returns:

True if storage profiling is configured.

Return type:

bool

HaveProfileTaskQueue()[source]

Determines if task queue profiling is configured.

Returns:

True if task queue profiling is configured.

Return type:

bool

HaveProfileTasks()[source]

Determines if tasks profiling is configured.

Returns:

True if tasks profiling is configured.

Return type:

bool

__init__()[source]

Initializes a profiling configuration object.
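
The Have… methods of the profiling configuration can be read as membership tests against the profilers set. The sketch below illustrates that idea with a hypothetical stand-in class; it is not plaso's ProfilingConfiguration and only covers two of the profilers.

```python
class ProfilingSettings:
    """Hypothetical stand-in for a profiling configuration."""

    def __init__(self, profilers=None):
        # Names of the enabled profilers, for example 'memory' or 'parsers'.
        self.profilers = set(profilers or [])

    def have_profile_memory(self):
        """Returns True if memory profiling is configured."""
        return "memory" in self.profilers

    def have_profile_parsers(self):
        """Returns True if parsers profiling is configured."""
        return "parsers" in self.profilers


settings = ProfilingSettings(profilers=["memory", "storage"])
```

With these settings have_profile_memory() is True and have_profile_parsers() is False.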

plaso.engine.engine module

The processing engine.

class plaso.engine.engine.BaseEngine[source]

Bases: object

Processing engine interface.

knowledge_base

knowledge base.

Type:

KnowledgeBase

BuildArtifactsRegistry(artifact_definitions_path, custom_artifacts_path)[source]

Builds an artifact definitions registry.

Parameters:
  • artifact_definitions_path (str) – path to artifact definitions directory or file.

  • custom_artifacts_path (str) – path to custom artifact definitions directory or file.

Raises:

BadConfigOption – if artifact definitions cannot be read.

BuildCollectionFilters(environment_variables, user_accounts, artifact_filter_names=None, filter_file_path=None)[source]

Builds collection filters from artifacts or filter file if available.

Parameters:
  • environment_variables (list[EnvironmentVariableArtifact]) – environment variables.

  • user_accounts (list[UserAccountArtifact]) – user accounts.

  • artifact_filter_names (Optional[list[str]]) – names of artifact definitions that are used for filtering file system and Windows Registry key paths.

  • filter_file_path (Optional[str]) – path of filter file.

Raises:

InvalidFilter – if no valid file system find specifications are built.

classmethod CreateSession(artifact_filter_names=None, command_line_arguments=None, debug_mode=False, filter_file_path=None, preferred_encoding='utf-8')[source]

Creates a session attribute container.

Parameters:
  • artifact_filter_names (Optional[list[str]]) – names of artifact definitions that are used for filtering file system and Windows Registry key paths.

  • command_line_arguments (Optional[str]) – the command line arguments.

  • debug_mode (Optional[bool]) – True if debug mode was enabled.

  • filter_file_path (Optional[str]) – path to a file with find specifications.

  • preferred_encoding (Optional[str]) – preferred encoding.

Returns:

session attribute container.

Return type:

Session

GetCollectionExcludedFindSpecs()[source]

Retrieves find specifications to exclude from collection.

Returns:

find specifications to exclude from collection.

Return type:

list[dfvfs.FindSpec]

GetCollectionIncludedFindSpecs()[source]

Retrieves find specifications to include in collection.

Returns:

find specifications to include in collection.

Return type:

list[dfvfs.FindSpec]

GetSourceFileSystem(file_system_path_spec, resolver_context=None)[source]

Retrieves the file system of the source.

Parameters:
  • file_system_path_spec (dfvfs.PathSpec) – path specification of the source file system to process.

  • resolver_context (Optional[dfvfs.Context]) – resolver context.

Returns:

file system and mount point path specification. The mount point path specification refers to either a directory or a volume on a storage media device or image. It is needed by the dfVFS file system to indicate the base location of the file system.

Return type:

tuple[dfvfs.FileSystem, path.PathSpec]

Raises:

RuntimeError – if source file system path specification is not set.

PreprocessSource(file_system_path_specs, storage_writer, resolver_context=None)[source]

Preprocesses a source.

Parameters:
  • file_system_path_specs (list[dfvfs.PathSpec]) – path specifications of the source file systems to process.

  • storage_writer (StorageWriter) – storage writer.

  • resolver_context (Optional[dfvfs.Context]) – resolver context.

Returns:

system configurations found in the source.

Return type:

list[SystemConfigurationArtifact]

SetStatusUpdateInterval(status_update_interval)[source]

Sets the status update interval.

Parameters:

status_update_interval (float) – status update interval.

__init__()[source]

Initializes an engine.

plaso.engine.extractors module

Extractor classes, used to extract information from sources.

class plaso.engine.extractors.EventDataExtractor(force_parser=False, parser_filter_expression=None)[source]

Bases: object

The event data extractor.

ParseDataStream(parser_mediator, file_entry, data_stream_name)[source]

Parses a data stream of a file entry with the enabled parsers.

Parameters:
  • parser_mediator (ParserMediator) – mediates interactions between parsers and other components, such as storage and dfVFS.

  • file_entry (dfvfs.FileEntry) – file entry.

  • data_stream_name (str) – data stream name.

Raises:

RuntimeError – if the file-like object or the parser object is missing.

ParseFileEntryMetadata(parser_mediator, file_entry)[source]

Parses the file entry metadata such as file system data.

Parameters:
  • parser_mediator (ParserMediator) – mediates interactions between parsers and other components, such as storage and dfVFS.

  • file_entry (dfvfs.FileEntry) – file entry.

ParseMetadataFile(parser_mediator, file_entry, data_stream_name)[source]

Parses a metadata file.

Parameters:
  • parser_mediator (ParserMediator) – mediates interactions between parsers and other components, such as storage and dfVFS.

  • file_entry (dfvfs.FileEntry) – file entry.

  • data_stream_name (str) – data stream name.

__init__(force_parser=False, parser_filter_expression=None)[source]

Initializes an event data extractor.

Parameters:
  • force_parser (Optional[bool]) – True if a specified parser should be forced to be used to extract events.

  • parser_filter_expression (Optional[str]) –

    parser filter expression, where None represents all parsers and plugins.

    A parser filter expression is a comma separated value string that denotes which parsers and plugins should be used. See filters/parser_filter.py for details of the expression syntax.

class plaso.engine.extractors.PathSpecExtractor[source]

Bases: object

Path specification extractor.

A path specification extractor extracts path specifications from a source directory, file or storage media device or image.

ExtractPathSpecs(path_spec, find_specs=None, recurse_file_system=True, resolver_context=None)[source]

Extracts path specifications from a specific source.

Parameters:
  • path_spec (dfvfs.PathSpec) – path specification.

  • find_specs (Optional[list[dfvfs.FindSpec]]) – find specifications used in path specification extraction.

  • recurse_file_system (Optional[bool]) – True if extraction should recurse into a file system.

  • resolver_context (Optional[dfvfs.Context]) – resolver context.

Yields:

dfvfs.PathSpec – path specification of a file entry found in the source.

plaso.engine.knowledge_base module

The artifact knowledge base object.

The knowledge base is filled by user provided input and the pre-processing phase. It is intended to provide successive phases, like the parsing and analysis phases, with essential information like the time zone and codepage of the source data.

class plaso.engine.knowledge_base.KnowledgeBase[source]

Bases: object

The knowledge base.

AddEnvironmentVariable(environment_variable)[source]

Adds an environment variable.

Parameters:

environment_variable (EnvironmentVariableArtifact) – environment variable artifact.

Raises:

KeyError – if the environment variable already exists.

GetEnvironmentVariable(name)[source]

Retrieves an environment variable.

Parameters:

name (str) – name of the environment variable.

Returns:

environment variable artifact or None if there was no value set for the given name.

Return type:

EnvironmentVariableArtifact

GetEnvironmentVariables()[source]

Retrieves the environment variables.

Returns:

environment variable artifacts.

Return type:

list[EnvironmentVariableArtifact]

GetHostname()[source]

Retrieves the hostname related to the event.

If the hostname is not stored in the event it is determined based on the preprocessing information that is stored inside the storage file.

Returns:

hostname.

Return type:

str

GetValue(identifier, default_value=None)[source]

Retrieves a value by identifier.

Parameters:
  • identifier (str) – case insensitive unique identifier for the value.

  • default_value (object) – default value.

Returns:

value or default value if not available.

Return type:

object

Raises:

TypeError – if the identifier is not a string type.

ReadSystemConfigurationArtifact(system_configuration)[source]

Reads the knowledge base values from a system configuration artifact.

Note that this overwrites existing values in the knowledge base.

Parameters:

system_configuration (SystemConfigurationArtifact) – system configuration artifact.

SetActiveSession(session_identifier)[source]

Sets the active session.

Parameters:

session_identifier (str) – session identifier where None represents the default active session.

SetCodepage(codepage)[source]

Sets the codepage.

Parameters:

codepage (str) – codepage.

Raises:

ValueError – if the codepage is not supported.

SetEnvironmentVariable(environment_variable)[source]

Sets an environment variable.

Parameters:

environment_variable (EnvironmentVariableArtifact) – environment variable artifact.

SetHostname(hostname)[source]

Sets a hostname.

Parameters:

hostname (HostnameArtifact) – hostname artifact.

SetLanguage(language)[source]

Sets the language.

Parameters:

language (str) – language.

SetTimeZone(time_zone)[source]

Sets the time zone.

Parameters:

time_zone (str) – time zone.

Raises:

ValueError – if the time zone is not supported.

SetValue(identifier, value)[source]

Sets a value by identifier.

Parameters:
  • identifier (str) – case insensitive unique identifier for the value.

  • value (object) – value.

Raises:

TypeError – if the identifier is not a string type.
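
The GetValue and SetValue contract described above (case insensitive identifiers, a default value, and a TypeError for non-string identifiers) can be sketched with a small dictionary-backed store. The class below is a hypothetical illustration, not the KnowledgeBase implementation; lower-casing the identifier is an assumption about how case insensitivity could be achieved.

```python
class ValueStore:
    """Hypothetical key-value store with case insensitive identifiers."""

    def __init__(self):
        self._values = {}

    def set_value(self, identifier, value):
        """Sets a value by identifier."""
        if not isinstance(identifier, str):
            raise TypeError("Identifier not a string type.")
        # Normalize the identifier so lookups ignore case.
        self._values[identifier.lower()] = value

    def get_value(self, identifier, default_value=None):
        """Retrieves a value by identifier, or default_value if not set."""
        if not isinstance(identifier, str):
            raise TypeError("Identifier not a string type.")
        return self._values.get(identifier.lower(), default_value)


store = ValueStore()
store.set_value("OperatingSystem", "Windows")
```

Here store.get_value("operatingsystem") returns the stored value even though the casing differs from the one used when setting it.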

__init__()[source]

Initializes a knowledge base.

property codepage

codepage of the current session.

Type:

str

property language

language of the current session.

Type:

str

property timezone

time zone of the current session.

Type:

datetime.tzinfo

plaso.engine.logger module

The engine sub module logger.

plaso.engine.path_filters module

Path filters.

Path filters are specified in filter files and are used during collection to include or exclude file system paths.

class plaso.engine.path_filters.PathCollectionFiltersHelper[source]

Bases: object

Path collection filters helper.

excluded_file_system_find_specs

file system find specifications of paths to exclude from the collection.

Type:

list[dfvfs.FindSpec]

included_file_system_find_specs

file system find specifications of paths to include in the collection.

Type:

list[dfvfs.FindSpec]

BuildFindSpecs(path_filters, environment_variables=None)[source]

Builds find specifications from path filters.

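Parameters:
  • path_filters (list[PathFilter]) – path filters.

  • environment_variables (Optional[list[EnvironmentVariableArtifact]]) – environment variables.

__init__()[source]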

Initializes a collection filters helper.

class plaso.engine.path_filters.PathFilter(filter_type, description=None, path_separator='/', paths=None)[source]

Bases: object

Path filter.

description

description of the purpose of the filter or None if not set.

Type:

str

filter_type

indicates if the filter should include or exclude paths during collection.

Type:

str

path_separator

path segment separator.

Type:

str

paths

paths to filter.

Type:

list[str]

FILTER_TYPE_EXCLUDE = 'exclude'
FILTER_TYPE_INCLUDE = 'include'
__init__(filter_type, description=None, path_separator='/', paths=None)[source]

Initializes a path filter.

Parameters:
  • filter_type (str) – indicates if the filter should include or exclude paths during collection.

  • description (Optional[str]) – description of the purpose of the filter.

  • path_separator (Optional[str]) – path segment separator.

  • paths (Optional[list[str]]) – paths to filter.

Raises:

ValueError – if the filter type contains an unsupported value.
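
The validation described for the initializer can be sketched as follows. The class is a simplified, hypothetical stand-in that mirrors the documented attributes and constants; it is not plaso's PathFilter, and the error message text is an assumption.

```python
class SimplePathFilter:
    """Hypothetical stand-in for a path filter."""

    FILTER_TYPE_EXCLUDE = "exclude"
    FILTER_TYPE_INCLUDE = "include"

    def __init__(
            self, filter_type, description=None, path_separator="/",
            paths=None):
        # Reject anything other than the two supported filter types.
        if filter_type not in (
                self.FILTER_TYPE_EXCLUDE, self.FILTER_TYPE_INCLUDE):
            raise ValueError(f"Unsupported filter type: {filter_type!s}")

        self.description = description
        self.filter_type = filter_type
        self.path_separator = path_separator
        self.paths = list(paths or [])


include_filter = SimplePathFilter(
    SimplePathFilter.FILTER_TYPE_INCLUDE, paths=["/var/log"])
```

Passing a filter type such as "skip" raises ValueError before any attribute is set.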

plaso.engine.path_helper module

The path helper.

class plaso.engine.path_helper.PathHelper[source]

Bases: object

Class that implements the path helper.

classmethod ExpandGlobStars(path, path_separator)[source]

Expands globstars “**” in a path.

A globstar “**” will recursively match all files and zero or more directories and subdirectories.

By default the maximum recursion depth is 10 subdirectories. A numeric value after the globstar, such as “**5”, can be used to define the maximum recursion depth.

Parameters:
  • path (str) – path to be expanded.

  • path_separator (str) – path segment separator.

Returns:

paths expanded for each glob.

Return type:

list[str]
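
The depth rule can be illustrated with a simplified sketch that rewrites a globstar into one path per recursion level, each level adding a “*” segment. This is a hypothetical helper, not plaso's implementation: it only handles the first globstar in the path and always starts at one directory level, which may differ from what ExpandGlobStars returns.

```python
import re

# Matches a globstar with an optional numeric depth, such as "**" or "**5".
_GLOBSTAR = re.compile(r"\*\*(\d*)")

# Default maximum recursion depth, as stated in the description above.
DEFAULT_DEPTH = 10


def expand_globstars(path, path_separator, default_depth=DEFAULT_DEPTH):
    """Expands the first globstar in path into a list of wildcard paths."""
    match = _GLOBSTAR.search(path)
    if match is None:
        return [path]

    depth = int(match.group(1)) if match.group(1) else default_depth
    prefix, suffix = path[:match.start()], path[match.end():]

    # One expanded path per recursion level: "*", "*/*", "*/*/*", ...
    return [
        prefix + path_separator.join(["*"] * level) + suffix
        for level in range(1, depth + 1)]
```

Under these assumptions “/tmp/**2” expands to “/tmp/*” and “/tmp/*/*”, while a bare “**” produces 10 paths.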

classmethod ExpandUsersVariablePath(path, path_separator, user_accounts)[source]

Expands a path with a users variable, such as %%users.homedir%%.

Parameters:
  • path (str) – path with users variable.

  • path_separator (str) – path segment separator.

  • user_accounts (list[UserAccountArtifact]) – user accounts.

Returns:

paths for which the users variables have been expanded.

Return type:

list[str]
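
A minimal sketch of the idea behind users variable expansion: the variable is replaced once per user account, which yields one path per user. The helper below is hypothetical; it takes plain home directory strings instead of UserAccountArtifact objects, handles only %%users.homedir%%, and ignores path separator conversion.

```python
USERS_HOMEDIR_VARIABLE = "%%users.homedir%%"


def expand_users_homedir(path, home_directories):
    """Expands %%users.homedir%% in path for every home directory."""
    if USERS_HOMEDIR_VARIABLE not in path:
        # Nothing to expand, return the path unchanged.
        return [path]
    return [
        path.replace(USERS_HOMEDIR_VARIABLE, home_directory)
        for home_directory in home_directories]
```

With the home directories “/home/alice” and “/root”, the path “%%users.homedir%%/.bash_history” expands to two paths, one per user.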

classmethod ExpandWindowsPath(path, environment_variables)[source]

Expands a Windows path containing environment variables.

Parameters:
  • path (str) – Windows path with environment variables.

  • environment_variables (list[EnvironmentVariableArtifact]) – environment variables.

Returns:

expanded Windows path.

Return type:

str
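
Environment variable expansion in a Windows path can be sketched with a regular expression substitution. The function below is a hypothetical illustration and not plaso's implementation: it takes a plain dictionary instead of a list of EnvironmentVariableArtifact, matches names case insensitively, and leaves unknown variables untouched, all of which are assumptions.

```python
import re

# Matches a variable reference such as %SystemRoot%.
_VARIABLE = re.compile(r"%([^%\\]+)%")


def expand_windows_path(path, environment_variables):
    """Expands %NAME% references in a Windows path."""
    # Windows environment variable names are case insensitive.
    lookup = {
        name.upper(): value for name, value in environment_variables.items()}

    def _replace(match):
        # Keep the original text if the variable is not known.
        return lookup.get(match.group(1).upper(), match.group(0))

    return _VARIABLE.sub(_replace, path)
```

For example, with SystemRoot set to “C:\Windows” the path “%SystemRoot%\System32” becomes “C:\Windows\System32”.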

classmethod ExpandWindowsPathSegments(path_segments, environment_variables)[source]

Expands Windows path segments containing environment variables.

Parameters:
  • path_segments (list[str]) – Windows path segments with environment variables.

  • environment_variables (list[EnvironmentVariableArtifact]) – environment variables.

Returns:

expanded Windows path segments.

Return type:

list[str]

classmethod GetDisplayNameForPathSpec(path_spec)[source]

Retrieves the display name of a path specification.

Parameters:

path_spec (dfvfs.PathSpec) – path specification.

Returns:

human readable version of the path specification or None if no path specification was provided.

Return type:

str

classmethod GetPathSegmentSeparator(path_spec)[source]

Retrieves the path segment separator of a path specification.

Parameters:

path_spec (dfvfs.PathSpec) – path specification.

Returns:

path segment separator.

Return type:

str

classmethod GetRelativePathForPathSpec(path_spec)[source]

Retrieves the relative path of a path specification.

If a mount path is defined the path will be relative to the mount point, otherwise the path is relative to the root of the file system that is used by the path specification.

Parameters:

path_spec (dfvfs.PathSpec) – path specification.

Returns:

relative path or None.

Return type:

str

classmethod GetWindowsSystemPath(path, environment_variables)[source]

Retrieves a Windows system path.

Parameters:
  • path (str) – Windows path with environment variables.

  • environment_variables (list[EnvironmentVariableArtifact]) – environment variables.

Returns:

Windows system path and filename.

Return type:

tuple[str, str]

plaso.engine.process_info module

Information about a running process.

class plaso.engine.process_info.ProcessInfo(pid)[source]

Bases: object

Provides information about a running process.

GetUsedMemory()[source]

Retrieves the amount of memory used by the process.

Returns:

amount of memory in bytes used by the process or None if not available.

Return type:

int

__init__(pid)[source]

Initializes process information.

Parameters:

pid (int) – process identifier (PID).

Raises:
  • IOError – If the process identified by the PID does not exist.

  • OSError – If the process identified by the PID does not exist.

plaso.engine.processing_status module

Processing status classes.

class plaso.engine.processing_status.EventsStatus[source]

Bases: object

The status of the events.

number_of_duplicate_events

number of duplicate events, not including the original.

Type:

int

number_of_events_from_time_slice

number of events from time slice.

Type:

int

number_of_filtered_events

number of events excluded by the event filter.

Type:

int

number_of_macb_grouped_events

number of events grouped based on MACB.

Type:

int

total_number_of_events

total number of events in the storage file.

Type:

int

__init__()[source]

Initializes an events status.

class plaso.engine.processing_status.ProcessStatus[source]

Bases: object

The status of an individual process.

display_name

human readable name of the file entry currently being processed by the process.

Type:

str

identifier

process identifier.

Type:

str

number_of_consumed_event_data

total number of event data consumed by the process.

Type:

int

number_of_consumed_event_data_delta

number of event data consumed by the process since the last status update.

Type:

int

number_of_consumed_event_tags

total number of event tags consumed by the process.

Type:

int

number_of_consumed_event_tags_delta

number of event tags consumed by the process since the last status update.

Type:

int

number_of_consumed_events

total number of events consumed by the process.

Type:

int

number_of_consumed_events_delta

number of events consumed by the process since the last status update.

Type:

int

number_of_consumed_reports

total number of event reports consumed by the process.

Type:

int

number_of_consumed_reports_delta

number of event reports consumed by the process since the last status update.

Type:

int

number_of_consumed_sources

total number of event sources consumed by the process.

Type:

int

number_of_consumed_sources_delta

number of event sources consumed by the process since the last status update.

Type:

int

number_of_produced_event_data

total number of event data produced by the process.

Type:

int

number_of_produced_event_data_delta

number of event data produced by the process since the last status update.

Type:

int

number_of_produced_event_tags

total number of event tags produced by the process.

Type:

int

number_of_produced_event_tags_delta

number of event tags produced by the process since the last status update.

Type:

int

number_of_produced_events

total number of events produced by the process.

Type:

int

number_of_produced_events_delta

number of events produced by the process since the last status update.

Type:

int

number_of_produced_reports

total number of event reports produced by the process.

Type:

int

number_of_produced_reports_delta

number of event reports produced by the process since the last status update.

Type:

int

number_of_produced_sources

total number of event sources produced by the process.

Type:

int

number_of_produced_sources_delta

number of event sources produced by the process since the last status update.

Type:

int

pid

process identifier (PID).

Type:

int

status

human readable status indication such as “Hashing” or “Idle”.

Type:

str

used_memory

size of used memory in bytes.

Type:

int

UpdateNumberOfEventData(number_of_consumed_event_data, number_of_produced_event_data)[source]

Updates the number of event data.

Parameters:
  • number_of_consumed_event_data (int) – total number of event data consumed by the process.

  • number_of_produced_event_data (int) – total number of event data produced by the process.

Raises:

ValueError – if the consumed or produced number of event data is smaller than the value of the previous update.

UpdateNumberOfEventReports(number_of_consumed_reports, number_of_produced_reports)[source]

Updates the number of event reports.

Parameters:
  • number_of_consumed_reports (int) – total number of event reports consumed by the process.

  • number_of_produced_reports (int) – total number of event reports produced by the process.

Raises:

ValueError – if the consumed or produced number of event reports is smaller than the value of the previous update.

UpdateNumberOfEventSources(number_of_consumed_sources, number_of_produced_sources)[source]

Updates the number of event sources.

Parameters:
  • number_of_consumed_sources (int) – total number of event sources consumed by the process.

  • number_of_produced_sources (int) – total number of event sources produced by the process.

Raises:

ValueError – if the consumed or produced number of event sources is smaller than the value of the previous update.

UpdateNumberOfEventTags(number_of_consumed_event_tags, number_of_produced_event_tags)[source]

Updates the number of event tags.

Parameters:
  • number_of_consumed_event_tags (int) – total number of event tags consumed by the process.

  • number_of_produced_event_tags (int) – total number of event tags produced by the process.

Raises:

ValueError – if the consumed or produced number of event tags is smaller than the value of the previous update.

UpdateNumberOfEvents(number_of_consumed_events, number_of_produced_events)[source]

Updates the number of events.

Parameters:
  • number_of_consumed_events (int) – total number of events consumed by the process.

  • number_of_produced_events (int) – total number of events produced by the process.

Raises:

ValueError – if the consumed or produced number of events is smaller than the value of the previous update.

__init__()[source]

Initializes a process status.
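
The relationship between a total, its delta and the ValueError documented for the Update… methods can be sketched with a single counter. The class below is a hypothetical illustration, not the ProcessStatus implementation; returning whether the value changed is an assumption.

```python
class StatusCounter:
    """Hypothetical counter that tracks a total and its last delta."""

    def __init__(self):
        self.total = 0
        self.delta = 0

    def update(self, total):
        """Updates the total and returns True if it changed."""
        # Totals are expected to only grow between status updates.
        if total < self.total:
            raise ValueError(
                "Number is smaller than the value of the previous update.")

        self.delta = total - self.total
        self.total = total
        return self.delta > 0


produced_events = StatusCounter()
produced_events.update(5)
produced_events.update(8)
```

After these two updates the total is 8 and the delta is 3; a later update with a smaller total raises ValueError.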

class plaso.engine.processing_status.ProcessingStatus[source]

Bases: object

The status of the overall extraction process (processing).

aborted

True if processing was aborted.

Type:

bool

error_path_specs

path specifications that caused critical errors during processing.

Type:

list[dfvfs.PathSpec]

events_status

status information about events.

Type:

EventsStatus

foreman_status

foreman processing status.

Type:

ProcessStatus

start_time

time that the processing was started. Contains the number of microseconds since January 1, 1970, 00:00:00 UTC.

Type:

float

tasks_status

status information about tasks.

Type:

TasksStatus

UpdateEventsStatus(events_status)[source]

Updates the events status.

Parameters:

events_status (EventsStatus) – status information about events.

UpdateForemanStatus(identifier, status, pid, used_memory, display_name, number_of_consumed_sources, number_of_produced_sources, number_of_consumed_event_data, number_of_produced_event_data, number_of_consumed_events, number_of_produced_events, number_of_consumed_event_tags, number_of_produced_event_tags, number_of_consumed_reports, number_of_produced_reports)[source]

Updates the status of the foreman.

Parameters:
  • identifier (str) – foreman identifier.

  • status (str) – human readable status indication such as “Hashing” or “Idle”.

  • pid (int) – process identifier (PID).

  • used_memory (int) – size of used memory in bytes.

  • display_name (str) – human readable name of the file entry currently being processed by the foreman.

  • number_of_consumed_sources (int) – total number of event sources consumed by the foreman.

  • number_of_produced_sources (int) – total number of event sources produced by the foreman.

  • number_of_consumed_event_data (int) – total number of event data consumed by the foreman.

  • number_of_produced_event_data (int) – total number of event data produced by the foreman.

  • number_of_consumed_events (int) – total number of events consumed by the foreman.

  • number_of_produced_events (int) – total number of events produced by the foreman.

  • number_of_consumed_event_tags (int) – total number of event tags consumed by the foreman.

  • number_of_produced_event_tags (int) – total number of event tags produced by the foreman.

  • number_of_consumed_reports (int) – total number of event reports consumed by the foreman.

  • number_of_produced_reports (int) – total number of event reports produced by the foreman.

UpdateTasksStatus(tasks_status)[source]

Updates the tasks status.

Parameters:

tasks_status (TasksStatus) – status information about tasks.

UpdateWorkerStatus(identifier, status, pid, used_memory, display_name, number_of_consumed_sources, number_of_produced_sources, number_of_consumed_event_data, number_of_produced_event_data, number_of_consumed_events, number_of_produced_events, number_of_consumed_event_tags, number_of_produced_event_tags, number_of_consumed_reports, number_of_produced_reports)[source]

Updates the status of a worker.

Parameters:
  • identifier (str) – worker identifier.

  • status (str) – human readable status indication such as “Hashing” or “Idle”.

  • pid (int) – process identifier (PID).

  • used_memory (int) – size of used memory in bytes.

  • display_name (str) – human readable name of the file entry currently being processed by the worker.

  • number_of_consumed_sources (int) – total number of event sources consumed by the worker.

  • number_of_produced_sources (int) – total number of event sources produced by the worker.

  • number_of_consumed_event_data (int) – total number of event data consumed by the worker.

  • number_of_produced_event_data (int) – total number of event data produced by the worker.

  • number_of_consumed_events (int) – total number of events consumed by the worker.

  • number_of_produced_events (int) – total number of events produced by the worker.

  • number_of_consumed_event_tags (int) – total number of event tags consumed by the worker.

  • number_of_produced_event_tags (int) – total number of event tags produced by the worker.

  • number_of_consumed_reports (int) – total number of event reports consumed by the worker.

  • number_of_produced_reports (int) – total number of event reports produced by the worker.

__init__()[source]

Initializes a processing status.

property workers_status

The worker status objects sorted by identifier.
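As a rough illustration of how UpdateWorkerStatus and the workers_status property relate, the sketch below keeps one status object per worker in a dictionary keyed by identifier and returns them sorted by that key. WorkerStatusSketch and ProcessingStatusSketch are hypothetical stand-ins written for this example, not plaso classes, and only a few of the documented counters are modelled.

```python
from dataclasses import dataclass


@dataclass
class WorkerStatusSketch:
    """Hypothetical stand-in holding a subset of the documented fields."""
    identifier: str
    status: str = ''
    pid: int = 0
    used_memory: int = 0
    number_of_produced_events: int = 0


class ProcessingStatusSketch:
    """Hypothetical stand-in, not the plaso ProcessingStatus class."""

    def __init__(self):
        # Assumption: worker status objects are stored per identifier.
        self._workers_status = {}

    def UpdateWorkerStatus(
            self, identifier, status, pid, used_memory,
            number_of_produced_events):
        """Creates or updates the status entry of a worker."""
        worker_status = self._workers_status.setdefault(
            identifier, WorkerStatusSketch(identifier))
        worker_status.status = status
        worker_status.pid = pid
        worker_status.used_memory = used_memory
        worker_status.number_of_produced_events = number_of_produced_events

    @property
    def workers_status(self):
        """The worker status objects sorted by identifier."""
        return [self._workers_status[identifier]
                for identifier in sorted(self._workers_status)]


processing_status = ProcessingStatusSketch()
processing_status.UpdateWorkerStatus('Worker_01', 'Hashing', 1002, 2048, 7)
processing_status.UpdateWorkerStatus('Worker_00', 'Idle', 1001, 1024, 3)

worker_identifiers = [
    worker_status.identifier
    for worker_status in processing_status.workers_status]
```

Calling UpdateWorkerStatus again with an existing identifier overwrites the stored values in this sketch rather than adding a second entry.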

class plaso.engine.processing_status.TasksStatus[source]

Bases: object

The status of the tasks.

number_of_abandoned_tasks

number of abandoned tasks.

Type:

int

number_of_queued_tasks

number of queued tasks.

Type:

int

number_of_tasks_pending_merge

number of tasks pending merge.

Type:

int

number_of_tasks_processing

number of tasks processing.

Type:

int

total_number_of_tasks

total number of tasks.

Type:

int

__init__()[source]

Initializes a tasks status.

plaso.engine.profilers module

The profiler classes.

class plaso.engine.profilers.AnalyzersProfiler(identifier, configuration)[source]

Bases: CPUTimeProfiler

The analyzers profiler.

class plaso.engine.profilers.CPUTimeMeasurement[source]

Bases: object

The CPU time measurement.

start_sample_time

start sample time or None if not set.

Type:

float

total_cpu_time

total CPU time or None if not set.

Type:

float

SampleStart()[source]

Starts measuring the CPU time.

SampleStop()[source]

Stops measuring the CPU time.

__init__()[source]

Initializes the CPU time measurement.
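The SampleStart and SampleStop pair can be pictured with the following minimal sketch. CPUTimeMeasurementSketch is a hypothetical stand-in; using time.process_time as the clock, accumulating over several samples and resetting start_sample_time on stop are assumptions made for the example, not statements about how plaso implements the class.

```python
import time


class CPUTimeMeasurementSketch:
    """Hypothetical stand-in for the documented CPUTimeMeasurement."""

    def __init__(self):
        # Both attributes are None until a sample has been taken.
        self.start_sample_time = None
        self.total_cpu_time = None

    def SampleStart(self):
        """Starts measuring the CPU time."""
        # Assumption: process-wide CPU time is used as the clock.
        self.start_sample_time = time.process_time()

    def SampleStop(self):
        """Stops measuring and adds the elapsed CPU time to the total."""
        if self.start_sample_time is not None:
            elapsed_cpu_time = time.process_time() - self.start_sample_time
            self.total_cpu_time = (self.total_cpu_time or 0.0) + elapsed_cpu_time
            self.start_sample_time = None


measurement = CPUTimeMeasurementSketch()
measurement.SampleStart()
sum(value * value for value in range(100000))  # Some CPU-bound work.
measurement.SampleStop()
```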

class plaso.engine.profilers.CPUTimeProfiler(identifier, configuration)[source]

Bases: SampleFileProfiler

The CPU time profiler.

StartTiming(profile_name)[source]

Starts timing CPU time.

Parameters:

profile_name (str) – name of the profile to sample.

StopTiming(profile_name)[source]

Stops timing CPU time.

Parameters:

profile_name (str) – name of the profile to sample.

class plaso.engine.profilers.MemoryProfiler(identifier, configuration)[source]

Bases: SampleFileProfiler

The memory profiler.

Sample(profile_name, used_memory)[source]

Takes a sample for profiling.

Parameters:
  • profile_name (str) – name of the profile to sample.

  • used_memory (int) – amount of used memory in bytes.

class plaso.engine.profilers.ProcessingProfiler(identifier, configuration)[source]

Bases: CPUTimeProfiler

The processing profiler.

class plaso.engine.profilers.SampleFileProfiler(identifier, configuration)[source]

Bases: object

Shared functionality for sample file-based profilers.

classmethod IsSupported()[source]

Determines if the profiler is supported.

Returns:

True if the profiler is supported.

Return type:

bool

Start()[source]

Starts the profiler.

Stop()[source]

Stops the profiler.

__init__(identifier, configuration)[source]

Initializes a sample file profiler.

Sample files are gzip compressed UTF-8 encoded CSV files.

Parameters:
  • identifier (str) – identifier of the profiling session used to create the sample filename.

  • configuration (ProfilingConfiguration) – profiling configuration.
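Because sample files are gzip compressed UTF-8 encoded CSV files, they can be written and read with the Python standard library alone. The sketch below is only an illustration of that file format: the file name and the column names (time, name, used_memory) are assumptions and may not match what the profilers actually write.

```python
import csv
import gzip
import os
import tempfile

# Hypothetical file name and column names; the real ones may differ.
sample_path = os.path.join(tempfile.mkdtemp(), 'memory-example.csv.gz')

# Write the samples as gzip compressed, UTF-8 encoded CSV.
with gzip.open(sample_path, 'wt', encoding='utf-8', newline='') as file_object:
    writer = csv.writer(file_object)
    writer.writerow(['time', 'name', 'used_memory'])
    writer.writerow(['1700000000.0', 'main', '1048576'])

# Read the samples back, decompressing and decoding on the fly.
with gzip.open(sample_path, 'rt', encoding='utf-8', newline='') as file_object:
    rows = list(csv.reader(file_object))
```

The same reading code should work on any sample file that follows the stated format, provided the column layout is taken from the file header rather than from this example.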

class plaso.engine.profilers.SerializersProfiler(identifier, configuration)[source]

Bases: CPUTimeProfiler

The serializers profiler.

class plaso.engine.profilers.StorageProfiler(identifier, configuration)[source]

Bases: SampleFileProfiler

The storage profiler.

Sample(profile_name, operation, description, data_size, compressed_data_size)[source]

Takes a sample of data read or written for profiling.

Parameters:
  • profile_name (str) – name of the profile to sample.

  • operation (str) – operation, either ‘read’ or ‘write’.

  • description (str) – description of the data read or written.

  • data_size (int) – size of the data read or written in bytes.

  • compressed_data_size (int) – size of the compressed data read or written in bytes.

StartTiming(profile_name)[source]

Starts timing CPU time.

Parameters:

profile_name (str) – name of the profile to sample.

StopTiming(profile_name)[source]

Stops timing CPU time.

Parameters:

profile_name (str) – name of the profile to sample.

class plaso.engine.profilers.TaskQueueProfiler(identifier, configuration)[source]

Bases: SampleFileProfiler

The task queue profiler.

Sample(tasks_status)[source]

Takes a sample of the status of queued tasks for profiling.

Parameters:

tasks_status (TasksStatus) – status information about tasks.

class plaso.engine.profilers.TasksProfiler(identifier, configuration)[source]

Bases: SampleFileProfiler

The tasks profiler.

Sample(task, status)[source]

Takes a sample of the status of a task for profiling.

Parameters:
  • task (Task) – a task.

  • status (str) – status.

plaso.engine.tagging_file module

Tagging file.

class plaso.engine.tagging_file.TaggingFile(path)[source]

Bases: object

Tagging file that defines one or more event tagging rules.

GetEventTaggingRules()[source]

Retrieves the event tagging rules from the tagging file.

Returns:

tagging rules, which consist of one or more filter objects per label.

Return type:

dict[str, EventObjectFilter]

Raises:

TaggingFileError – if a filter expression cannot be compiled.

__init__(path)[source]

Initializes a tagging file.

Parameters:

path (str) – path to a file that contains one or more event tagging rules.

plaso.engine.timeliner module

The timeliner, which is used to generate events from event data.

class plaso.engine.timeliner.EventDataTimeliner(data_location=None, preferred_year=None, system_configurations=None)[source]

Bases: object

The event data timeliner.

number_of_produced_events

number of produced events.

Type:

int

parsers_counter

number of events per parser or parser plugin.

Type:

collections.Counter

ProcessEventData(storage_writer, event_data, event_data_stream)[source]

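Generates events from event data.

Parameters:
  • storage_writer (StorageWriter) – storage writer.

  • event_data (EventData) – event data.

  • event_data_stream (EventDataStream) – event data stream.

SetPreferredTimeZone(time_zone_string)[source]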

Sets the preferred time zone for zone-less date and time values.

Parameters:

time_zone_string (str) – time zone such as “Europe/Amsterdam” or None if the time zone determined by preprocessing or the default should be used.

Raises:

ValueError – if the time zone is not supported.
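The contract described above (None falls back to another time zone, an unsupported name raises ValueError) can be sketched as follows. ResolveTimeZone is a hypothetical helper, the 'UTC' fallback is an assumption, and the standard library zoneinfo module is used here only for illustration; it is not necessarily the library plaso relies on.

```python
import zoneinfo


def ResolveTimeZone(time_zone_string, default_time_zone='UTC'):
    """Returns the time zone name to use (hypothetical helper).

    Raises:
      ValueError: if the time zone is not supported.
    """
    if time_zone_string is None:
        # Assumption: fall back to a default when no preference is given.
        return default_time_zone

    try:
        zoneinfo.ZoneInfo(time_zone_string)
    except (zoneinfo.ZoneInfoNotFoundError, ValueError) as exception:
        raise ValueError(
            f'Unsupported time zone: {time_zone_string}') from exception

    return time_zone_string
```

A call such as ResolveTimeZone('Europe/Amsterdam') returns the name unchanged when the time zone database is available on the system.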

__init__(data_location=None, preferred_year=None, system_configurations=None)[source]

Initializes an event data timeliner.

Parameters:
  • data_location (Optional[str]) – path of the timeliner configuration file.

  • preferred_year (Optional[int]) – preferred initial year value for date-less date and time values.

  • system_configurations (Optional[list[SystemConfigurationArtifact]]) – system configurations.

plaso.engine.worker module

The event extraction worker.

class plaso.engine.worker.EventExtractionWorker(force_parser=False, parser_filter_expression=None)[source]

Bases: object

Event extraction worker.

The event extraction worker determines which parsers are suitable for parsing a particular file entry or data stream. The parsers extract relevant data from file system and/or file content data. All extracted data is passed to the parser mediator for further processing.

last_activity_timestamp

timestamp received that indicates the last time activity was observed.

Type:

int

processing_status

human readable status indication such as: ‘Extracting’, ‘Hashing’.

Type:

str

GetAnalyzerNames()[source]

Gets the names of the active analyzers.

Returns:

names of active analyzers.

Return type:

list[str]

ProcessFileEntry(parser_mediator, file_entry)[source]

Processes a file entry.

Parameters:
  • parser_mediator (ParserMediator) – mediates interactions between parsers and other components, such as storage and dfVFS.

  • file_entry (dfvfs.FileEntry) – file entry.

ProcessPathSpec(parser_mediator, path_spec)[source]

Processes a path specification.

Parameters:
  • parser_mediator (ParserMediator) – mediates interactions between parsers and other components, such as storage and dfVFS.

  • path_spec (dfvfs.PathSpec) – path specification.

SetAnalyzersProfiler(analyzers_profiler)[source]

Sets the analyzers profiler.

Parameters:

analyzers_profiler (AnalyzersProfiler) – analyzers profiler.

SetExtractionConfiguration(configuration)[source]

Sets the extraction configuration settings.

Parameters:

configuration (ExtractionConfiguration) – extraction configuration.

SetProcessingProfiler(processing_profiler)[source]

Sets the processing profiler.

Parameters:

processing_profiler (ProcessingProfiler) – processing profiler.

SignalAbort()[source]

Signals the extraction worker to abort.

__init__(force_parser=False, parser_filter_expression=None)[source]

Initializes an event extraction worker.

Parameters:
  • force_parser (Optional[bool]) – True if a specified parser should be forced to be used to extract events.

  • parser_filter_expression (Optional[str]) –

    parser filter expression, where None represents all parsers and plugins.

    A parser filter expression is a comma separated value string that denotes which parsers and plugins should be used. See filters/parser_filter.py for details of the expression syntax.

    This function does not support presets, and requires a parser filter expression where presets have been expanded.
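To make the idea of a comma separated parser filter expression concrete, the sketch below splits such a string into parser names and, per parser, plugin names. SplitParserFilterExpression is a hypothetical helper; the "parser/plugin" notation and the example names are assumptions made for illustration, and the authoritative syntax is the one defined in filters/parser_filter.py.

```python
def SplitParserFilterExpression(parser_filter_expression):
    """Splits an expression into plugin names per parser (hypothetical helper).

    Returns:
      dict[str, set[str]]: plugin names per parser name, or None if the
          expression is None, which represents all parsers and plugins.
    """
    if parser_filter_expression is None:
        return None

    plugins_per_parser = {}
    for element in parser_filter_expression.split(','):
        element = element.strip()
        if not element:
            continue

        # Assumption: a plugin is written as "parser/plugin".
        parser_name, _, plugin_name = element.partition('/')
        plugin_names = plugins_per_parser.setdefault(parser_name, set())
        if plugin_name:
            plugin_names.add(plugin_name)

    return plugins_per_parser


plugins_per_parser = SplitParserFilterExpression(
    'filestat, winreg/userassist,winreg/bagmru')
```

Presets are deliberately not handled here, in line with the note above that the expression must already have its presets expanded.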

class plaso.engine.worker.EventExtractionWorkerVolumeScanner(*args: Any, **kwargs: Any)[source]

Bases: VolumeScanner

Volume scanner used by the event extraction worker.

GetBasePathSpecs(source_path_spec, options=None)[source]

Determines the base path specifications.

Parameters:
  • source_path_spec (dfvfs.PathSpec) – source path specification.

  • options (Optional[VolumeScannerOptions]) – volume scanner options. If None the default volume scanner options are used, which are defined in the VolumeScannerOptions class.

Returns:

path specifications.

Return type:

list[dfvfs.PathSpec]

Raises:

dfvfs.ScannerError – if the source path does not exist, or if the source path is not a file or directory, or if the format of or within the source file is not supported.

plaso.engine.yaml_filter_file module

YAML-based filter file.

class plaso.engine.yaml_filter_file.YAMLFilterFile[source]

Bases: object

YAML-based filter file.

A YAML-based filter file contains one or more path filters, for example:

    description: Include filter with Linux paths.
    type: include
    path_separator: '/'
    paths:
    - '/usr/bin'

Where:
  • description, is an optional description of the purpose of the path filter;

  • type, defines the filter type, which can be “include” or “exclude”;

  • path_separator, defines the path segment separator, which is “/” by default;

  • paths, defines regular expressions of paths to filter on.

Note that the regular expressions need to be defined per path segment. For example, to filter “/usr/bin/echo” and “/usr/sbin/echo” the following expression could be defined: “/usr/(bin|sbin)/echo”.

Note that when the path segment separator is defined as “\” it needs to be escaped as “\\”, since “\” is used by the regular expression as escape character.

A path may contain path expansion attributes, for example: %{SystemRoot}\System32
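The per-segment rule for path expressions can be sketched as follows: split both the expression and the candidate path on the path separator and match each pair of segments with its own regular expression. MatchesPathFilter is a hypothetical helper, and requiring the same number of segments on both sides is a simplification made for this example, not a description of how plaso or dfVFS evaluate the filters.

```python
import re


def MatchesPathFilter(path, path_expression, path_separator='/'):
    """Checks a path against an expression, one path segment at a time."""
    path_segments = path.split(path_separator)
    expression_segments = path_expression.split(path_separator)

    # Simplification: only paths with the same number of segments match.
    if len(path_segments) != len(expression_segments):
        return False

    return all(
        re.fullmatch(expression, segment) is not None
        for expression, segment in zip(expression_segments, path_segments))


matches_bin = MatchesPathFilter('/usr/bin/echo', '/usr/(bin|sbin)/echo')
matches_sbin = MatchesPathFilter('/usr/sbin/echo', '/usr/(bin|sbin)/echo')
matches_lib = MatchesPathFilter('/usr/lib/echo', '/usr/(bin|sbin)/echo')
```

Splitting on the separator first is why an expression cannot span more than one path segment: a pattern such as “usr/.*” would be cut into “usr” and “.*” before any matching takes place.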

ReadFromFile(path)[source]

Reads the path filters from the YAML-based filter file.

Parameters:

path (str) – path to a filter file.

Returns:

path filters.

Return type:

list[PathFilter]

plaso.engine.yaml_timeliner_file module

YAML-based timeliner configuration file.

class plaso.engine.yaml_timeliner_file.TimelinerDefinition(data_type)[source]

Bases: object

Timeliner definition.

attribute_mappings

date and time description (timestamp_desc) per attribute name.

Type:

dict[str, str]

data_type

event data type indicator.

Type:

str

place_holder_event

True if the timeliner should generate a placeholder event if no date and time values were found in the event data.

Type:

bool

__init__(data_type)[source]

Initializes a timeliner definition.

Parameters:

data_type (str) – event data type indicator.

class plaso.engine.yaml_timeliner_file.YAMLTimelinerConfigurationFile[source]

Bases: object

YAML-based timeliner configuration file.

A YAML-based timeliner configuration file contains one or more timeliner definitions. A timeliner definition consists of:

    data_type: 'fs:stat'
    attribute_mappings:
    - name: 'access_time'
      description: 'Last Access Time'
    place_holder_event: true

Where:
  • data_type, defines the corresponding event data type;

  • attribute_mappings, defines attribute mappings;

  • place_holder_event, defines if the timeliner should generate a placeholder event.
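A timeliner definition like the one above maps naturally onto the documented TimelinerDefinition attributes. The sketch below starts from the dictionary a YAML parser would typically produce for that definition and fills a stand-in object. TimelinerDefinitionSketch and BuildDefinition are hypothetical names used for illustration only, and the default of False for place_holder_event is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TimelinerDefinitionSketch:
    """Hypothetical stand-in mirroring the documented attributes."""
    data_type: str
    attribute_mappings: dict = field(default_factory=dict)
    place_holder_event: bool = False


def BuildDefinition(definition_values):
    """Builds a definition from parsed YAML values (hypothetical helper)."""
    definition = TimelinerDefinitionSketch(definition_values['data_type'])

    # Each mapping links an attribute name to its date and time description.
    for mapping in definition_values.get('attribute_mappings', []):
        definition.attribute_mappings[mapping['name']] = mapping['description']

    definition.place_holder_event = definition_values.get(
        'place_holder_event', False)
    return definition


# The values a YAML parser would typically return for the example above.
definition_values = {
    'data_type': 'fs:stat',
    'attribute_mappings': [
        {'name': 'access_time', 'description': 'Last Access Time'}],
    'place_holder_event': True}

definition = BuildDefinition(definition_values)
```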

ReadFromFile(path)[source]

Reads the timeliner definitions from a YAML file.

Parameters:

path (str) – path to a timeliner configuration file.

Yields:

TimelinerDefinition – a timeliner definition.

Module contents