plaso.multi_processing package¶
Submodules¶
plaso.multi_processing.analysis_process module¶
The multi-process analysis process.
-
class
plaso.multi_processing.analysis_process.
AnalysisProcess
(event_queue, storage_writer, knowledge_base, analysis_plugin, processing_configuration, data_location=None, event_filter_expression=None, **kwargs)[source]¶ Bases:
plaso.multi_processing.base_process.MultiProcessBaseProcess
Multi-processing analysis process.
plaso.multi_processing.base_process module¶
Base class for a process used in multi-processing.
-
class
plaso.multi_processing.base_process.
MultiProcessBaseProcess
(processing_configuration, enable_sigsegv_handler=False, **kwargs)[source]¶ Bases:
multiprocessing.context.Process
Multi-processing process interface.
-
rpc_port
¶ port number of the process status RPC server.
- Type
int
-
property
name
¶ process name.
- Type
str
-
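As a rough illustration of the interface above, the following sketch subclasses `multiprocessing.Process` and exposes a process name and an RPC port. It is not plaso's implementation: the class name `SketchBaseProcess`, the helper `_start_status_server` and the use of a shared `multiprocessing.Value` to carry the port back to the parent are all assumptions made for this example.

```python
import multiprocessing


class SketchBaseProcess(multiprocessing.Process):
    """Hypothetical stand-in for a process interface; not plaso's code."""

    def __init__(self, enable_sigsegv_handler=False, **kwargs):
        super().__init__(**kwargs)
        self._enable_sigsegv_handler = enable_sigsegv_handler
        # Assumption: the port is kept in shared memory so that the parent
        # can read the value the child writes. 0 means "not set". No lock is
        # used since there is a single writer and a single reader.
        self.rpc_port = multiprocessing.Value('I', 0, lock=False)

    def _start_status_server(self):
        """Placeholder for starting a status RPC server; returns its port."""
        return 0

    def run(self):
        # Runs in the child process once start() has been called.
        self.rpc_port.value = self._start_status_server()


# The process object is only created here, it is not started.
process = SketchBaseProcess(name='Worker_00')
print(process.name)
```

The `name` keyword argument and the `name` property are provided by `multiprocessing.Process` itself, so the sketch does not need to redefine them.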
plaso.multi_processing.engine module¶
The multi-process processing engine.
-
class
plaso.multi_processing.engine.
MultiProcessEngine
[source]¶ Bases:
plaso.engine.engine.BaseEngine
Multi-process engine base.
This class contains functionality to:
* monitor and manage worker processes;
* retrieve process status information via RPC;
* manage the status update thread.
plaso.multi_processing.logger module¶
The multi-processing sub module logger.
plaso.multi_processing.plaso_xmlrpc module¶
XML RPC server and client.
-
class
plaso.multi_processing.plaso_xmlrpc.
ThreadedXMLRPCServer
(callback)[source]¶ Bases:
plaso.multi_processing.rpc.RPCServer
Threaded XML RPC server.
-
class
plaso.multi_processing.plaso_xmlrpc.
XMLProcessStatusRPCClient
[source]¶ Bases:
plaso.multi_processing.plaso_xmlrpc.XMLRPCClient
XML process status RPC client.
-
class
plaso.multi_processing.plaso_xmlrpc.
XMLProcessStatusRPCServer
(callback)[source]¶ Bases:
plaso.multi_processing.plaso_xmlrpc.ThreadedXMLRPCServer
XML process status threaded RPC server.
-
class
plaso.multi_processing.plaso_xmlrpc.
XMLRPCClient
[source]¶ Bases:
plaso.multi_processing.rpc.RPCClient
XML RPC client.
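The server and client classes above pair a threaded XML RPC server with a callback that reports process status. The sketch below shows the general pattern using only the Python standard library. It does not use plaso's classes: the function names, the `status` method name and the keys of the returned dictionary are assumptions made for illustration.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def get_status():
    """Hypothetical status callback; the keys are illustrative only."""
    return {'identifier': 'Worker_00', 'status': 'running'}


def start_status_server(callback, hostname='127.0.0.1', port=0):
    """Starts an XML RPC server that serves requests in a background thread.

    Port 0 asks the operating system to pick any free port.
    """
    server = SimpleXMLRPCServer(
        (hostname, port), logRequests=False, allow_none=True)
    server.register_function(callback, 'status')
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


def query_status(hostname, port):
    """Calls the remote status method and returns its result."""
    url = 'http://{0:s}:{1:d}'.format(hostname, port)
    with xmlrpc.client.ServerProxy(url, allow_none=True) as proxy:
        return proxy.status()


server, thread = start_status_server(get_status)
port = server.server_address[1]
status = query_status('127.0.0.1', port)

# Stop the serve_forever loop and release the socket.
server.shutdown()
server.server_close()
thread.join()
```

Running the server in its own thread keeps the status endpoint responsive while the main thread of the process does other work.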
plaso.multi_processing.psort module¶
The psort multi-processing engine.
-
class
plaso.multi_processing.psort.
PsortEventHeap
[source]¶ Bases:
object
Psort event heap.
-
PopEvent
()[source]¶ Pops an event from the heap.
- Returns
a tuple containing:
* str: identifier of the event MACB group or None if the event cannot be grouped.
* str: identifier of the event content.
* EventObject: event.
* EventData: event data.
* EventDataStream: event data stream.
- Return type
tuple
-
PopEvents
()[source]¶ Pops events from the heap.
- Yields
tuple – containing:
* str: identifier of the event MACB group or None if the event cannot be grouped.
* str: identifier of the event content.
* EventObject: event.
* EventData: event data.
* EventDataStream: event data stream.
-
PushEvent
(event, event_data, event_data_stream)[source]¶ Pushes an event onto the heap.
- Parameters
event (EventObject) – event.
event_data (EventData) – event data.
event_data_stream (EventDataStream) – event data stream.
-
property
number_of_events
¶ number of events on the heap.
- Type
int
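The push and pop behaviour described above can be sketched with the standard library `heapq` module. This is a simplified illustration, not the PsortEventHeap implementation: the class `SketchEventHeap`, its method names and the choice to order events by a caller-supplied timestamp and content identifier are assumptions, and the MACB group identifier is left out.

```python
import heapq


class SketchEventHeap(object):
    """Hypothetical event heap ordered by timestamp, then content identifier."""

    def __init__(self):
        self._heap = []
        self._sequence = 0

    @property
    def number_of_events(self):
        """int: number of events on the heap."""
        return len(self._heap)

    def push(self, timestamp, content_identifier, event):
        """Pushes an event onto the heap."""
        # The sequence number breaks ties, so that the event objects
        # themselves never need to be compared.
        entry = (timestamp, content_identifier, self._sequence, event)
        heapq.heappush(self._heap, entry)
        self._sequence += 1

    def pop(self):
        """Pops the earliest event, or returns None if the heap is empty."""
        try:
            _, content_identifier, _, event = heapq.heappop(self._heap)
        except IndexError:
            return None
        return content_identifier, event

    def pop_events(self):
        """Yields events in heap order until the heap is empty."""
        event_tuple = self.pop()
        while event_tuple is not None:
            yield event_tuple
            event_tuple = self.pop()


event_heap = SketchEventHeap()
event_heap.push(30, 'c', {'message': 'third'})
event_heap.push(10, 'a', {'message': 'first'})
event_heap.push(20, 'b', {'message': 'second'})

first = event_heap.pop()
remaining = [identifier for identifier, _ in event_heap.pop_events()]
```

Events come off the heap in ascending timestamp order regardless of the order in which they were pushed.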
-
-
class
plaso.multi_processing.psort.
PsortMultiProcessEngine
(worker_memory_limit=None, worker_timeout=None)[source]¶ Bases:
plaso.multi_processing.engine.MultiProcessEngine
Psort multi-processing engine.
-
AnalyzeEvents
(knowledge_base_object, storage_writer, data_location, analysis_plugins, processing_configuration, event_filter=None, event_filter_expression=None, status_update_callback=None)[source]¶ Analyzes events in a plaso storage.
- Parameters
knowledge_base_object (KnowledgeBase) – contains information from the source data needed for processing.
storage_writer (StorageWriter) – storage writer.
data_location (str) – path to the location that data files should be loaded from.
analysis_plugins (dict[str, AnalysisPlugin]) – analysis plugins that should be run and their names.
processing_configuration (ProcessingConfiguration) – processing configuration.
event_filter (Optional[EventObjectFilter]) – event filter.
event_filter_expression (Optional[str]) – event filter expression.
status_update_callback (Optional[function]) – callback function for status updates.
- Raises
KeyboardInterrupt – if a keyboard interrupt was raised.
-
ExportEvents
(knowledge_base_object, storage_reader, output_module, processing_configuration, deduplicate_events=True, event_filter=None, status_update_callback=None, time_slice=None, use_time_slicer=False)[source]¶ Exports events using an output module.
- Parameters
knowledge_base_object (KnowledgeBase) – contains information from the source data needed for processing.
storage_reader (StorageReader) – storage reader.
output_module (OutputModule) – output module.
processing_configuration (ProcessingConfiguration) – processing configuration.
deduplicate_events (Optional[bool]) – True if events should be deduplicated.
event_filter (Optional[EventObjectFilter]) – event filter.
status_update_callback (Optional[function]) – callback function for status updates.
time_slice (Optional[TimeSlice]) – slice of time to output.
use_time_slicer (Optional[bool]) – True if the ‘time slicer’ should be used. The ‘time slicer’ will provide a context of events around an event of interest.
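One way to provide a context of events around an event of interest is to keep a small buffer of preceding events and to pass through a fixed number of following events. The sketch below assumes a count-based context; the function `slice_around_matches`, its parameters and the use of a count instead of a time window are assumptions for illustration, and plaso's time slicer may work differently.

```python
import collections


def slice_around_matches(events, is_of_interest, context_size=2):
    """Yields events of interest with up to context_size events on each side.

    Args:
      events (iterable): events in chronological order.
      is_of_interest (function): returns True for an event of interest.
      context_size (int): number of events to keep before and after a match.
    """
    # Bounded buffer that only remembers the most recent non-matching events.
    before = collections.deque(maxlen=context_size)
    remaining_after = 0

    for event in events:
        if is_of_interest(event):
            # Flush the buffered events that precede the match.
            while before:
                yield before.popleft()
            yield event
            remaining_after = context_size
        elif remaining_after > 0:
            yield event
            remaining_after -= 1
        else:
            before.append(event)


timestamps = list(range(10))
sliced = list(slice_around_matches(timestamps, lambda value: value == 5))
```

With ten consecutive values and the value 5 as the event of interest, the result contains the two values before it, the value itself and the two values after it.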
-
plaso.multi_processing.rpc module¶
The RPC client and server interface.
-
class
plaso.multi_processing.rpc.
RPCServer
(callback)[source]¶ Bases:
object
RPC server interface.
plaso.multi_processing.task_engine module¶
The task multi-process processing engine.
-
class
plaso.multi_processing.task_engine.
TaskMultiProcessEngine
(maximum_number_of_tasks=None, number_of_worker_processes=0, worker_memory_limit=None, worker_timeout=None)[source]¶ Bases:
plaso.multi_processing.engine.MultiProcessEngine
Class that defines the task multi-process engine.
This class contains functionality to:
* monitor and manage extraction tasks;
* merge results returned by extraction workers.
-
ProcessSources
(session, source_path_specs, storage_writer, processing_configuration, enable_sigsegv_handler=False, status_update_callback=None)[source]¶ Processes the sources and extracts events.
- Parameters
session (Session) – session in which the sources are processed.
source_path_specs (list[dfvfs.PathSpec]) – path specifications of the sources to process.
storage_writer (StorageWriter) – storage writer for a session storage.
processing_configuration (ProcessingConfiguration) – processing configuration.
enable_sigsegv_handler (Optional[bool]) – True if the SIGSEGV handler should be enabled.
status_update_callback (Optional[function]) – callback function for status updates.
- Returns
processing status.
- Return type
-
plaso.multi_processing.task_manager module¶
The task manager.
-
class
plaso.multi_processing.task_manager.
TaskManager
[source]¶ Bases:
object
Manages tasks and tracks their completion and status.
A task being tracked by the manager must be in exactly one of the following states:
- abandoned: a task that was queued or processing and is assumed to be abandoned because it exceeded the maximum inactive time.
- merging: a task that is being merged by the engine.
- pending_merge: the task has been processed and is ready to be merged with the session storage.
- processed: a worker has completed processing the task, but it is not ready to be merged into the session storage.
- processing: a worker is processing the task.
- queued: the task is waiting for a worker to start processing it. It is also possible that a worker has already completed the task, but no status update was collected from the worker while it processed the task.
Once the engine reports that a task is completely merged, it is removed from the task manager.
Tasks are considered “pending” when there is more work that needs to be done to complete these tasks. Pending applies to tasks that are:
* not abandoned;
* abandoned, but need to be retried.
Abandoned tasks without corresponding retry tasks are considered “failed” when the foreman is done processing.
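The state rules above can be sketched as a small tracker that keeps every task in exactly one state and raises KeyError on a transition from an unexpected state. This is a simplified illustration rather than the TaskManager implementation: the class `SketchTaskTracker`, its method names, the explicit `now` arguments and the default inactive time are assumptions, and retry tasks are omitted.

```python
class SketchTaskTracker(object):
    """Hypothetical tracker that keeps each task in exactly one state."""

    def __init__(self, maximum_inactive_time=60.0):
        self._maximum_inactive_time = maximum_inactive_time
        self._states = {}
        self._last_update = {}

    def _move(self, identifier, allowed_states, new_state, now):
        """Moves a task to a new state if its current state is allowed."""
        if self._states.get(identifier) not in allowed_states:
            raise KeyError('Task {0!s} is not in one of: {1:s}'.format(
                identifier, ', '.join(allowed_states)))
        self._states[identifier] = new_state
        self._last_update[identifier] = now

    def queue(self, identifier, now):
        self._states[identifier] = 'queued'
        self._last_update[identifier] = now

    def start_processing(self, identifier, now):
        self._move(identifier, ('queued',), 'processing', now)

    def finish_processing(self, identifier, now):
        self._move(
            identifier, ('queued', 'processing', 'abandoned'), 'processed',
            now)

    def mark_pending_merge(self, identifier, now):
        self._move(identifier, ('processed',), 'pending_merge', now)

    def start_merge(self, identifier, now):
        self._move(identifier, ('pending_merge',), 'merging', now)

    def complete(self, identifier):
        """Removes a task that has been completely merged."""
        if self._states.get(identifier) != 'merging':
            raise KeyError('Task {0!s} was not merging.'.format(identifier))
        del self._states[identifier]
        del self._last_update[identifier]

    def abandon_inactive(self, now):
        """Marks queued or processing tasks without recent updates."""
        for identifier, state in list(self._states.items()):
            inactive_time = now - self._last_update[identifier]
            if (state in ('queued', 'processing') and
                    inactive_time > self._maximum_inactive_time):
                self._states[identifier] = 'abandoned'

    def has_pending(self):
        """Simplification: every task that is not abandoned is pending."""
        return any(state != 'abandoned' for state in self._states.values())

    def state_of(self, identifier):
        return self._states.get(identifier)


tracker = SketchTaskTracker(maximum_inactive_time=60.0)
tracker.queue('task-a', now=0.0)
tracker.start_processing('task-a', now=1.0)
tracker.abandon_inactive(now=100.0)
```

After the last call, `task-a` has gone more than 60 seconds without an update and is marked as abandoned, so the tracker no longer reports pending work.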
-
CheckTaskToMerge
(task)[source]¶ Checks if the task should be merged.
- Parameters
task (Task) – task.
- Returns
True if the task should be merged.
- Return type
bool
- Raises
KeyError – if the task was not queued, processing or abandoned.
-
CompleteTask
(task)[source]¶ Completes a task.
The task is complete and can be removed from the task manager.
- Parameters
task (Task) – task.
- Raises
KeyError – if the task was not merging.
-
CreateRetryTask
()[source]¶ Creates a task to retry a previously abandoned task.
- Returns
a task that was abandoned but should be retried or None if there are no abandoned tasks that should be retried.
- Return type
-
CreateTask
(session_identifier, storage_format='sqlite')[source]¶ Creates a task.
- Parameters
session_identifier (str) – the identifier of the session the task is part of.
storage_format (Optional[str]) – the storage format that the task should be stored in.
- Returns
task attribute container.
- Return type
-
GetFailedTasks
()[source]¶ Retrieves all failed tasks.
Failed tasks are tasks that were abandoned and have no retry task once the foreman is done processing.
- Returns
tasks.
- Return type
list[Task]
-
GetProcessedTaskByIdentifier
(task_identifier)[source]¶ Retrieves a task that has been processed.
- Parameters
task_identifier (str) – unique identifier of the task.
- Returns
a task that has been processed.
- Return type
- Raises
KeyError – if the task was not processing, queued or abandoned.
-
GetStatusInformation
()[source]¶ Retrieves status information about the tasks.
- Returns
tasks status information.
- Return type
-
GetTaskPendingMerge
(current_task)[source]¶ Retrieves the first task that is pending merge or has a higher priority.
This function will check if there is a task with a higher merge priority than the current_task being merged. If so, that task with the higher priority is returned.
-
HasPendingTasks
()[source]¶ Determines if there are tasks running or in need of retrying.
- Returns
True if there are tasks that are active, ready to be merged or need to be retried.
- Return type
bool
-
RemoveTask
(task)[source]¶ Removes an abandoned task.
- Parameters
task (Task) – task.
- Raises
KeyError – if the task was not abandoned or the task was abandoned and was not retried.
-
SampleTaskStatus
(task, status)[source]¶ Takes a sample of the status of the task for profiling.
- Parameters
task (Task) – a task.
status (str) – status.
-
StartProfiling
(configuration, identifier)[source]¶ Starts profiling.
- Parameters
configuration (ProfilingConfiguration) – profiling configuration.
identifier (str) – identifier of the profiling session used to create the sample filename.
plaso.multi_processing.worker_process module¶
The multi-process worker process.
-
class
plaso.multi_processing.worker_process.
WorkerProcess
(task_queue, storage_writer, collection_filters_helper, knowledge_base, session_identifier, processing_configuration, **kwargs)[source]¶ Bases:
plaso.multi_processing.base_process.MultiProcessBaseProcess
Multi-processing worker process.