AgentSoap Threat Intelligence Feeds

This document outlines the OSINT (Open Source Intelligence) feeds ingested by the AgentSoap platform to provide real-time risk scoring for agentic workflows.

Active Data Sources

1. URLhaus (by Abuse.ch)

  • Purpose: Domain and URL reputation.
  • Source: https://urlhaus.abuse.ch/downloads/csv_recent/
  • Format: CSV
  • Ingestion Frequency: Daily
  • Metrics: Flags domains used for malware distribution and phishing.
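
As a rough illustration of how this feed could be reduced to domain entries, the sketch below parses URLhaus CSV text that has already been downloaded. It is not the platform's ingestion code. The column order is assumed from the header comment shipped in the dump, and the function name parse_urlhaus_csv and the default risk score of 90 are hypothetical placeholders.

```python
import csv
import io
from urllib.parse import urlsplit

# Assumed column order, taken from the header comment in the URLhaus dump.
URLHAUS_COLUMNS = [
    "id", "dateadded", "url", "url_status", "last_online",
    "threat", "tags", "urlhaus_link", "reporter",
]


def parse_urlhaus_csv(text, risk_score=90):
    """Turn URLhaus CSV text into de-duplicated domain records.

    risk_score is a placeholder default, not a value defined by the feed.
    """
    # Comment lines in the dump start with '#'; drop them and blank lines.
    data_lines = [
        line for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    reader = csv.DictReader(
        io.StringIO("\n".join(data_lines)), fieldnames=URLHAUS_COLUMNS
    )

    records = {}
    for row in reader:
        try:
            # hostname is lower-cased and excludes any port.
            host = urlsplit(row["url"] or "").hostname
        except ValueError:
            continue  # malformed URL, skip the row
        if not host or host in records:
            continue
        records[host] = {
            "type": "domain",
            "value": host,
            "risk_score": risk_score,
            "source": "URLhaus",
            "flag_reason": f"Listed by URLhaus ({row['threat'] or 'malicious URL'})",
        }
    return list(records.values())
```

Keeping only the first record per hostname means one flagged domain yields one entry, however many URLs the feed lists for it.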

2. Job Board Scams (Recruitment Fraud)

  • Purpose: Specialized detection for recruitment fraud and "Work from Home" scams.
  • Source: https://raw.githubusercontent.com/fin-threat-intel/job-scams/main/scams.json
  • Format: JSON
  • Ingestion Frequency: Daily
  • Metrics: High-severity flags (Score: 100) for domains hosting fake data-entry or mystery shopper roles.

3. HuggingFace Agentic-Threats (Patterns)

  • Purpose: Semantic pattern matching for Prompt Injections.
  • Source: https://huggingface.co/api/datasets/agentic-threats/raw/main/patterns.json
  • Format: JSON
  • Ingestion Frequency: Daily
  • Metrics: High-confidence regex and semantic strings known to trigger behavior hijacking in LLMs.
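
The regex half of this feed could be applied along the lines of the sketch below. The JSON shape of patterns.json is not documented here, so the entry layout (a "value" regex plus a "risk_score") and the two sample patterns are illustrative assumptions, as are the helper names compile_patterns and scan_text. Semantic (non-regex) matching is not covered.

```python
import re

# Illustrative entries only; the real feed contents and shape are not shown here.
ILLUSTRATIVE_PATTERNS = [
    {"value": r"ignore\s+(all\s+)?(previous|prior)\s+instructions", "risk_score": 95},
    {"value": r"reveal\s+(your\s+)?system\s+prompt", "risk_score": 90},
]


def compile_patterns(entries):
    """Compile each entry's regex, skipping entries that fail to compile."""
    compiled = []
    for entry in entries:
        try:
            compiled.append((re.compile(entry["value"], re.IGNORECASE), entry))
        except re.error:
            continue  # a bad pattern in the feed should not break scanning
    return compiled


def scan_text(text, compiled):
    """Return (highest matching risk_score, matching entries) for a string."""
    hits = [entry for regex, entry in compiled if regex.search(text)]
    top_score = max((hit["risk_score"] for hit in hits), default=0)
    return top_score, hits
```

Compiling once and reusing the compiled list avoids recompiling every pattern for each string that is scanned.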

4. Chainabuse (Wallets) - Planned

  • Purpose: Crypto wallet risk scoring.
  • Source: https://api.chainabuse.com/v1/reports
  • Metrics: Identifies wallets associated with scams and money laundering.

Data Schema (ThreatEntities)

All ingested threats are normalized into the threat_entities table:

Column      | Type    | Description
type        | Enum    | domain, vendor_name, routing_number, wallet_address, semantic_pattern
value       | String  | The unique identifier for the threat (e.g., scam.com).
risk_score  | Integer | 0-100 indicating the severity.
source      | String  | The origin feed (e.g., URLhaus).
flag_reason | String  | Human-readable explanation for the block.
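
The same schema can be mirrored in application code to catch malformed records before they reach the table. The sketch below is one possible shape, not the platform's model: the class names ThreatType and ThreatEntity are hypothetical, and the validation rules (non-empty value, integer score within 0-100) are inferred from the column descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class ThreatType(str, Enum):
    """The five values listed for the type column."""
    DOMAIN = "domain"
    VENDOR_NAME = "vendor_name"
    ROUTING_NUMBER = "routing_number"
    WALLET_ADDRESS = "wallet_address"
    SEMANTIC_PATTERN = "semantic_pattern"


@dataclass(frozen=True)
class ThreatEntity:
    type: ThreatType
    value: str
    risk_score: int
    source: str
    flag_reason: str

    def __post_init__(self):
        # Accept plain strings such as "domain"; unknown types raise ValueError.
        object.__setattr__(self, "type", ThreatType(self.type))
        if not self.value:
            raise ValueError("value must be a non-empty string")
        if not isinstance(self.risk_score, int) or not 0 <= self.risk_score <= 100:
            raise ValueError("risk_score must be an integer between 0 and 100")
```

For example, ThreatEntity("domain", "scam.com", 100, "URLhaus", "Malware distribution") constructs normally, while a risk_score of 101 or an unknown type raises ValueError.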

Ingestion Pipeline

Ingestion is handled by the ingest:threat-feeds Artisan command. Each record is written with an atomic upsert, so re-running a feed refreshes existing entries instead of creating duplicates.
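
To illustrate the upsert behavior, the sketch below uses Python's built-in sqlite3 module in place of the platform's Artisan command. It assumes that (type, value) is the uniqueness key for threat_entities, which this document does not state. The helper name upsert_threats and the SQL text are hypothetical, and the ON CONFLICT clause requires SQLite 3.24 or newer.

```python
import sqlite3

# Assumption: a threat is uniquely identified by its (type, value) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS threat_entities (
    type        TEXT    NOT NULL,
    value       TEXT    NOT NULL,
    risk_score  INTEGER NOT NULL,
    source      TEXT    NOT NULL,
    flag_reason TEXT    NOT NULL,
    UNIQUE (type, value)
)
"""

UPSERT = """
INSERT INTO threat_entities (type, value, risk_score, source, flag_reason)
VALUES (:type, :value, :risk_score, :source, :flag_reason)
ON CONFLICT (type, value) DO UPDATE SET
    risk_score  = excluded.risk_score,
    source      = excluded.source,
    flag_reason = excluded.flag_reason
"""


def upsert_threats(conn, rows):
    """Insert new threats and refresh existing ones in a single transaction."""
    # 'with conn' commits on success and rolls back if any row fails,
    # so a feed run is applied entirely or not at all.
    with conn:
        conn.executemany(UPSERT, rows)
```

Running the same batch twice leaves one row per (type, value) pair, with the most recently ingested score and reason.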

Security Lab Live Sandbox

Test your payloads against the AgentSoap security logic. Enter a string below to see the generated implementation code.
