This guide outlines the process of updating existing complex xLEAPP modules to include LAVA output. These modules typically process a single group of file pattern searches but generate multiple HTML and TSV outputs.
Note that if you only need to split output across multiple HTML pages, or to build the HTML manually for some other reason, while the LAVA and TSV output is still a single table, you can probably still follow the instructions in Updating Modules for Automatic Output Generation. (See carCD.py for an example of manual HTML with automatic LAVA/TSV processing.)
- Update the __artifacts_v2__ block
- Modify imports (add LAVA and Media Manager functions)
- Handle Media Files (if applicable)
- Add LAVA output generation to the main function
- Use special data type handlers for LAVA output
Modify the __artifacts_v2__ dictionary to include all required fields for the complex artifact. This dictionary should be the first thing in the script, before any imports or other code. Make sure to set output_types to "none" since we're not using the artifact processor for automated output generation. Additionally, include a "function" key with the name of the main processing function as its value.
__artifacts_v2__ = {
    "complex_artifact": {
        "name": "Complex Artifact Name",
        "description": "Description of the complex artifact",
        "author": "@AuthorUsername",
        "version": "1.0",
        "date": "2023-05-24",
        "requirements": "none",
        "category": "Complex Artifact Category",
        "notes": "",
        "paths": ('Path/to/complex/artifact/files',),
        "output_types": "none",  # Changed from ["HTML", "TSV"] to "none"
        "function": "get_complex_artifact"  # This should match the name of your main processing function
    }
}

Add imports for LAVA functions and, if your module handles media, the Media Manager functions from ilapfuncs.
# Existing imports for your module...
# import scripts.artifact_report # For manual HTML reports
# from scripts.ilapfuncs import tsv # For manual TSV
# Add these for LAVA and Media Management:
from scripts.lavafuncs import lava_process_artifact, lava_insert_sqlite_data
from scripts.ilapfuncs import (
    check_in_media,
    check_in_embedded_media,
    get_file_path,  # If you use it to find files
    # ... other necessary ilapfuncs ...
)
import os  # Often needed for path joining with 'data' subfolder

If your complex artifact involves media files, use the Media Manager functions (check_in_media or check_in_embedded_media) to process them. This step should occur before you populate your data_list with media references and before you call lava_process_artifact.
- Get artifact_info: Retrieve the artifact's metadata dictionary from __artifacts_v2__ using the current function's name.

    # Inside your main processing function (e.g., get_complex_artifact)
    # This assumes 'complex_artifact' is the key in __artifacts_v2__
    # and 'get_complex_artifact' is the name of the current function.
    # This dict is passed to check_in_media or check_in_embedded_media.
    current_artifact_info = __artifacts_v2__["complex_artifact"]

- Call check_in_media or check_in_embedded_media: These functions return a media_ref_id (string), which you then put into your data_list.

  - For files on disk (check_in_media):

      image_file_pattern = "**/path/to/image.jpg"  # Pattern to find the media file
      found_image_path_in_extraction = get_file_path(files_found, image_file_pattern)
      media_ref_id = None
      if found_image_path_in_extraction:
          media_ref_id = check_in_media(
              artifact_info=current_artifact_info,
              report_folder=report_folder,
              seeker=seeker,
              files_found=files_found,
              file_path=image_file_pattern,
              name="Descriptive Name for Media Item"  # Optional
          )
      # Add media_ref_id (or placeholder if None) to your data_list
      # e.g., data_list_for_part1.append((timestamp, event_details, media_ref_id))

  - For embedded binary data (check_in_embedded_media):

      binary_media_data = record['blob_column']  # Actual bytes of the media
      source_db_path = get_file_path(files_found, "**/source_app.db")
      media_ref_id = None
      if binary_media_data and source_db_path:
          media_ref_id = check_in_embedded_media(
              artifact_info=current_artifact_info,
              report_folder=report_folder,
              seeker=seeker,
              source_file=source_db_path,
              data=binary_media_data,
              name="Embedded Media Item Name"  # Optional
          )
      # Add media_ref_id to your data_list

- Update data_headers: For each data_list that will contain media_ref_ids, ensure the corresponding data_headers list marks that column with the type 'media'.

    # For data_headers_part1 if it has a media column:
    data_headers_part1 = [('Timestamp', 'datetime'), 'Description', ('Photo', 'media')]
    # For data_headers_part2 if it also has media:
    data_headers_part2 = ['User', ('Attachment', 'media', 'max-width:50px;')]  # Optional style for HTML

  This 'media' type is essential for lava_process_artifact to correctly handle the media_ref_id. It also helps the manual HTML report generation if you have a helper function that understands this tuple format for media.
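A helper that understands the tuple format could look like the following sketch. The names plain_header_names and media_column_indexes are hypothetical and are not part of ilapfuncs; the sketch assumes only the two header shapes used in this guide: a bare string, or a tuple whose first element is the column name and whose second element is the type.

```python
def plain_header_names(data_headers):
    """Return only the column names from a mixed list of str / tuple headers."""
    names = []
    for header in data_headers:
        if isinstance(header, tuple):
            names.append(header[0])  # first tuple element is the column name
        else:
            names.append(header)
    return names


def media_column_indexes(data_headers):
    """Return the positions of columns typed as 'media'."""
    return [
        index
        for index, header in enumerate(data_headers)
        if isinstance(header, tuple) and len(header) > 1 and header[1] == 'media'
    ]


data_headers_part1 = [('Timestamp', 'datetime'), 'Description', ('Photo', 'media')]
data_headers_part2 = ['User', ('Attachment', 'media', 'max-width:50px;')]

print(plain_header_names(data_headers_part1))
print(media_column_indexes(data_headers_part2))
```

With something like this, the typed list can be kept for LAVA while the plain names are handed to manual HTML or TSV code that expects strings.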
After processing the data (including any media calls) and generating your manual HTML and TSV reports, add the LAVA output generation for each distinct dataset.
# In your main processing function, e.g., get_complex_artifact
def get_complex_artifact(files_found, report_folder, seeker, wrap_text, timezone_offset):
    data_list1 = []
    data_list2 = []
    data_headers1 = ['Column1', 'Column2', 'Column3']
    data_headers2 = ['ColumnA', 'ColumnB', 'ColumnC']

    for file_found in files_found:
        # Process data and populate data_list1 and data_list2
        # ...
        pass  # Placeholder so the loop body is valid Python

    # Generate HTML report for the first artifact
    report1 = ArtifactHtmlReport('Complex Artifact - Part 1')
    report1.start_artifact_report(report_folder, 'Complex Artifact - Part 1')
    report1.add_script()
    report1.write_artifact_data_table(data_headers1, data_list1, file_found)
    report1.end_artifact_report()

    # Generate TSV for the first artifact
    tsv(report_folder, data_headers1, data_list1, 'Complex Artifact - Part 1')

    # Generate HTML report for the second artifact
    report2 = ArtifactHtmlReport('Complex Artifact - Part 2')
    report2.start_artifact_report(report_folder, 'Complex Artifact - Part 2')
    report2.add_script()
    report2.write_artifact_data_table(data_headers2, data_list2, file_found)
    report2.end_artifact_report()

    # Generate TSV for the second artifact
    tsv(report_folder, data_headers2, data_list2, 'Complex Artifact - Part 2')

    # Generate LAVA output
    category = "Complex Artifact Category"
    module_name = "get_complex_artifact"

    # Add special data type handlers for LAVA output
    # (done after the HTML/TSV calls above, which received the plain string headers)
    data_headers1[0] = (data_headers1[0], 'datetime')
    data_headers2[2] = (data_headers2[2], 'date')

    # Process first artifact for LAVA
    table_name1, object_columns1, column_map1 = lava_process_artifact(category, module_name, 'Complex Artifact - Part 1', data_headers1, len(data_list1))
    lava_insert_sqlite_data(table_name1, data_list1, object_columns1, data_headers1, column_map1)

    # Process second artifact for LAVA
    table_name2, object_columns2, column_map2 = lava_process_artifact(category, module_name, 'Complex Artifact - Part 2', data_headers2, len(data_list2))
    lava_insert_sqlite_data(table_name2, data_list2, object_columns2, data_headers2, column_map2)

- The output_types in __artifacts_v2__ is set to "none" because any HTML/TSV generation is manual. LAVA output is added by directly calling lava_process_artifact and lava_insert_sqlite_data.
- Call the LAVA functions once for each distinct data table you want to represent in LAVA.
- The category comes from your __artifacts_v2__ entry. The module_name for lava_process_artifact is the name of your Python artifact processing function (e.g., get_complex_artifact). The artifact_name_for_lava (the third argument) should be a unique, descriptive name for the specific table being generated.
- Media Handling:
  - Use check_in_media / check_in_embedded_media from ilapfuncs.py.
  - Pass the data subfolder path (e.g., os.path.join(report_folder, 'data')) as the report_folder argument to these media functions.
  - These functions handle copying/linking media and creating LAVA database entries (media_items, media_references).
  - The media_ref_id returned is placed in your data_list. lava_process_artifact interprets this media_ref_id correctly when the column header is typed as 'media'.
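When a module produces more than two tables, the pair of LAVA calls can be driven from a loop. The sketch below is one possible arrangement, not part of xLEAPP: write_lava_tables is a hypothetical helper, and the two fake_* functions are stand-ins that only mimic the argument order and return shape used in this guide so the sketch runs outside the project. In a real module you would pass lava_process_artifact and lava_insert_sqlite_data instead.

```python
def write_lava_tables(category, module_name, tables, process_artifact, insert_data):
    """Call the LAVA function pair once per (artifact_name, data_headers, data_list)."""
    written = []
    for artifact_name, data_headers, data_list in tables:
        table_name, object_columns, column_map = process_artifact(
            category, module_name, artifact_name, data_headers, len(data_list))
        insert_data(table_name, data_list, object_columns, data_headers, column_map)
        written.append(table_name)
    return written


# Stand-ins (assumptions) that record calls instead of touching a database.
calls = []


def fake_process_artifact(category, module_name, artifact_name, data_headers, record_count):
    table_name = artifact_name.lower().replace(' - ', '_').replace(' ', '_')
    return table_name, [], {}


def fake_insert_data(table_name, data_list, object_columns, data_headers, column_map):
    calls.append((table_name, len(data_list)))


tables = [
    ('Complex Artifact - Part 1',
     [('Column1', 'datetime'), 'Column2', 'Column3'],
     [('2023-05-24 10:00:00', 'a', 'b')]),
    ('Complex Artifact - Part 2',
     ['ColumnA', 'ColumnB', ('ColumnC', 'date')],
     [('x', 'y', '2023-05-24'), ('z', 'w', '2023-05-25')]),
]

written = write_lava_tables('Complex Artifact Category', 'get_complex_artifact',
                            tables, fake_process_artifact, fake_insert_data)
print(written)
```

Passing the two functions in as parameters keeps the loop testable without an xLEAPP checkout; each artifact name in the list still needs to be unique.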
These changes will add LAVA output capability to your complex module while maintaining its existing HTML and TSV outputs and utilizing special data type handlers for LAVA.
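As a final check, the two __artifacts_v2__ requirements described earlier (output_types set to "none" and a "function" key naming the main processing function) can be verified mechanically. The following is a minimal sketch; check_manual_output_entry is a hypothetical helper, not something xLEAPP provides, and the trimmed dictionary and empty function exist only to make the example self-contained.

```python
def check_manual_output_entry(artifacts_v2, key, module_globals):
    """Return a list of problems found in one __artifacts_v2__ entry."""
    entry = artifacts_v2.get(key)
    if entry is None:
        return [f'missing key: {key}']
    problems = []
    if entry.get('output_types') != 'none':
        problems.append('output_types should be "none"')
    function_name = entry.get('function')
    if not function_name:
        problems.append('missing "function" key')
    elif not callable(module_globals.get(function_name)):
        problems.append(f'function not defined: {function_name}')
    return problems


# Trimmed example entry; a real entry carries the other fields as well.
__artifacts_v2__ = {
    "complex_artifact": {
        "name": "Complex Artifact Name",
        "output_types": "none",
        "function": "get_complex_artifact",
    }
}


def get_complex_artifact(files_found, report_folder, seeker, wrap_text, timezone_offset):
    """Placeholder for the module's main processing function."""


problems = check_manual_output_entry(__artifacts_v2__, "complex_artifact", globals())
print(problems)
```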