Deep SDR

From CECS wiki

This is an idea page for designs for an SDR project involving deep learning.

This could be part of an advanced repeater controller, or it could be built separately, like a bot that talks to the repeater via a secondary radio or an out-of-band channel.

Implementation approach

  1. Build a software/hardware radio to collect data
  2. Feed training data to a deep net
    1. voice recognition: extract a transcript
      • use existing OSS for recognition?
    2. voice semantics: extract call signs for labels
      • handle unlabeled voices
    3. voice fingerprinting: match voices to call signs
    4. attempt radio fingerprinting: match voices to radios
  3. Program responses to specific voices
    • greetings
    • mailboxes
    • RDF (radio direction finding)

Detailed suggested implementation

Stage 1:

  • (MOSTLY COMPLETE) Use GNU Radio libraries or other tools to record transmissions from specific radio channels with an SDR and save them to permanent storage
    • store a short sample in IQ format of the start and end of each transmission
    • use RF squelch, sub-audible tone squelch, and repeater courtesy tones to recognize transmission start and end
    • store the entire demodulated audio sample (10 s to 5 min per sample)
  • Build an authenticated web interface for privileged users to browse and manage samples
  • possibly design a cheap remote module that can be networked, crowdsourced, and/or distributed with a central data collection server
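The transmission start/end detection above can be sketched as a simple power squelch over framed IQ samples. This is a minimal NumPy illustration, not the GNU Radio implementation; the frame size and threshold are assumptions, and the synthetic burst stands in for a real capture:

```python
import numpy as np

def transmission_bounds(iq, frame=1024, thresh_db=-30.0):
    """Find (start, end) sample indices of transmissions in an IQ capture
    using a naive power squelch: a frame is 'active' when its mean power
    exceeds the threshold."""
    n = len(iq) // frame
    frames = iq[:n * frame].reshape(n, frame)
    power_db = 10 * np.log10(np.mean(np.abs(frames) ** 2, axis=1) + 1e-12)
    active = power_db > thresh_db
    edges = np.diff(active.astype(int))
    starts = (np.where(edges == 1)[0] + 1) * frame
    ends = (np.where(edges == -1)[0] + 1) * frame
    if active[0]:
        starts = np.insert(starts, 0, 0)   # capture began mid-transmission
    if active[-1]:
        ends = np.append(ends, n * frame)  # capture ended mid-transmission
    return list(zip(starts, ends))

# synthetic capture: low-level noise with a carrier burst in the middle
rng = np.random.default_rng(0)
iq = rng.normal(0, 1e-3, 4096) + 1j * rng.normal(0, 1e-3, 4096)
iq[1024:3072] += np.exp(2j * np.pi * 0.1 * np.arange(4096))[1024:3072]
bounds = transmission_bounds(iq)
print(bounds)  # → [(1024, 3072)]
```

A real deployment would combine this with the CTCSS and courtesy-tone cues listed above, since plain RF squelch cannot distinguish co-channel interference from the intended repeater.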

Stage 2:

  • Use voice recognition to extract text of the speech in the sample
  • Use parsing and AI techniques to identify the speaker from the text (call sign extraction) and label samples
  • Build an authenticated web interface for privileged users to browse, manage, verify, and edit sample metadata
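Call sign extraction from transcripts could start as simple pattern matching before bringing in heavier AI techniques. A rough sketch, assuming US-style amateur call signs (real ITU formats vary by country, and the transcripts here are hypothetical):

```python
import re

# US-centric amateur call sign pattern: prefix letter(s), digit, suffix.
# This is a simplification and will miss some foreign formats.
CALLSIGN_RE = re.compile(r'\b([AKNW][A-Z]{0,2}[0-9][A-Z]{1,3})\b')

def extract_callsigns(transcript):
    """Pull candidate call signs out of a voice-recognition transcript."""
    return CALLSIGN_RE.findall(transcript.upper())

found = extract_callsigns("this is kf4hcw monitoring, w4abc are you there")
print(found)  # → ['KF4HCW', 'W4ABC']
```

Candidates would still need verification (e.g. against an FCC database or the manual review interface) before being trusted as labels.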

Stage 3:

  • Use AI methods including deep learning to do radio (from IQ) and voice (from audio) fingerprinting of labeled samples
  • Verify labels are correct
  • Build an authenticated web interface for privileged users to browse, manage, verify, and edit samples and metadata
  • Build a repeater bridge to query metadata or flag recent samples
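The match-by-fingerprint flow can be illustrated with a deliberately naive spectral fingerprint and cosine similarity. A real system would use learned speaker embeddings from a deep net; this placeholder only shows how labeled samples would be compared, using synthetic "voices":

```python
import numpy as np

def spectral_fingerprint(audio, frame=512):
    """Very naive voice fingerprint: unit-normalized mean magnitude
    spectrum over windowed frames. A stand-in for a learned embedding."""
    n = len(audio) // frame
    frames = audio[:n * frame].reshape(n, frame) * np.hanning(frame)
    spec = np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)
    return spec / (np.linalg.norm(spec) + 1e-12)

def similarity(fp_a, fp_b):
    """Cosine similarity between two unit-normalized fingerprints."""
    return float(np.dot(fp_a, fp_b))

# two synthetic 'voices' with different fundamentals, sampled at 8 kHz
rng = np.random.default_rng(1)
t = np.arange(8192) / 8000.0
voice_a = np.sin(2 * np.pi * 150 * t) + 0.3 * np.sin(2 * np.pi * 450 * t)
voice_b = np.sin(2 * np.pi * 220 * t) + 0.3 * np.sin(2 * np.pi * 660 * t)
fp_a  = spectral_fingerprint(voice_a + 0.01 * rng.normal(size=8192))
fp_a2 = spectral_fingerprint(voice_a + 0.01 * rng.normal(size=8192))
fp_b  = spectral_fingerprint(voice_b)
print(similarity(fp_a, fp_a2), similarity(fp_a, fp_b))
```

Two noisy recordings of the same "voice" score near 1.0, while a different voice scores clearly lower; the label-verification step would threshold this score.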

Stage 4:

  • implement various actions based on identification information (brainstorm possibilities)
  • combine identification information with other SDR projects
  • Build a web interface to assign actions to identities and view past action history
  • Build a repeater bridge to activate actions for recent recognized and unrecognized identities
  • More ideas in ARC repeater bot
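Assigning actions to identities, as described above, amounts to a dispatch table with a fallback for unrecognized stations. A minimal sketch (the call signs and actions here are hypothetical examples):

```python
def greet(callsign):
    """Greeting action for a recognized station."""
    return f"Welcome back, {callsign}"

def announce_unknown(callsign):
    """Fallback action for an unrecognized station."""
    return "Unrecognized station heard"

# Hypothetical action table; in the design above, this mapping would be
# managed through the web interface and logged to the action history.
ACTIONS = {"W4ABC": greet}

def dispatch(callsign):
    """Run the action assigned to an identity, falling back for unknowns."""
    return ACTIONS.get(callsign, announce_unknown)(callsign)

print(dispatch("W4ABC"))  # → Welcome back, W4ABC
print(dispatch("K1XYZ"))  # → Unrecognized station heard
```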

Data collection

  • hardware: RTL-SDR on a PC or Pi
  • receive on one or more repeater outputs
    • segregate the received bandwidth into channels
    • squelch detection
      • CTCSS recognition
      • FM detection, audio volume detection
      • signal strength
    • filter
      • remove CTCSS
      • cut at squelch transitions
      • split into manageable chunk sizes
  • save data
    • timestamp, frequency
    • demodulated data
    • I/Q data (later?)
    • voice recognition transcript
    • codec2 compression??
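CTCSS recognition on the demodulated audio can be done cheaply with the Goertzel algorithm, which measures power at a single frequency without a full FFT. A sketch, assuming 8 kHz audio and testing only a few tones from the standard CTCSS set:

```python
import math

def goertzel_power(samples, fs, freq):
    """Power at a single frequency bin via the Goertzel algorithm —
    well suited to checking a handful of sub-audible CTCSS tones."""
    w = 2 * math.pi * freq / fs
    coeff = 2 * math.cos(w)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

def detect_ctcss(audio, fs, tones=(88.5, 103.5, 123.0)):
    """Return the candidate tone with the most power (the tone list is
    a small illustrative subset of the standard CTCSS set)."""
    return max(tones, key=lambda f: goertzel_power(audio, fs, f))

# one second of a synthetic 103.5 Hz tone at 8 kHz
fs = 8000
audio = [math.sin(2 * math.pi * 103.5 * n / fs) for n in range(fs)]
print(detect_ctcss(audio, fs))  # → 103.5
```

A production version would also compare tone power to broadband audio power to decide whether any CTCSS tone is present at all.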

Implementation

Components

SDR RX
    multiple stations, collect samples and inject into MQTT
MQTT bus
    distribute real-time events and data
compute modules
    process samples in real time and asynchronously and save to database
web server / database
    archive samples
    UI for labeling and browsing of samples
    API for remote processing and collection

  • SDR (multiple) sends audio clips to MQTT
  • webserver and processing server subscribe to raw audio clip feed
  • webserver
    • archives audio clips and metadata for long term storage
    • updates metadata in database
    • serves the UI and API
  • processing engine
    • preemptable and only runs when processing power is available
    • subscribes to live data from MQTT for processing
    • queries webserver for old data needing processing
    • when an event is recognized (and metadata generated and saved), the event is sent to MQTT (even if late)
  • MQTT clients
    • subscribe to events and/or data
    • must check timestamp to determine if they still care about potentially late events
    • format and relay items to external services (meshtastic, discord, repeater)
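The late-event check above can be a small pure function shared by all MQTT clients. A sketch, assuming JSON payloads and a 30-second relevance window (both assumptions, not part of the design above):

```python
import json
import time

MAX_AGE_S = 30.0  # assumed relevance window for, e.g., a repeater bridge

def still_relevant(payload, now=None):
    """Decide whether a possibly late MQTT event is still worth relaying.
    Every event carries the original frequency and timestamp, so a client
    can simply drop anything older than its relevance window."""
    now = time.time() if now is None else now
    event = json.loads(payload)
    return (now - event["timestamp"]) <= MAX_AGE_S

fresh = json.dumps({"frequency": 146940, "timestamp": 1000.0,
                    "event": "voice recognized"})
late = json.dumps({"frequency": 146940, "timestamp": 900.0,
                   "event": "voice recognized"})
print(still_relevant(fresh, now=1010.0))  # → True  (10 s old)
print(still_relevant(late, now=1010.0))   # → False (110 s old)
```

Different clients could use different windows: a Discord relay might accept hours-old events, while a live repeater announcement should not.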

web API

Note: authentication needed for all of these

receive data
    frequency, timestamp (key on all data)
    audio sample (from SDR)
    metadata: text, fingerprints (from processing)
query for work
    list of samples needing processing
query sample
    return metadata
query sample data
    return raw data
update metadata
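Since (frequency, timestamp) keys all data, the "receive data" endpoint can validate submissions with a small helper. A sketch with assumed JSON field names based on the outline above:

```python
import json

REQUIRED = {"frequency", "timestamp"}  # key fields present on all data

def validate_submission(payload):
    """Check an incoming 'receive data' submission: every record must
    carry the (frequency, timestamp) key; audio and metadata fields are
    optional. Field names are assumptions, not a fixed wire format."""
    record = json.loads(payload)
    missing = REQUIRED - record.keys()
    if missing:
        raise ValueError(f"missing key fields: {sorted(missing)}")
    return (record["frequency"], record["timestamp"])

key = validate_submission(json.dumps(
    {"frequency": 146940, "timestamp": 1700000000, "text": "this is W4ABC"}))
print(key)  # → (146940, 1700000000)
```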

schema

audio sample
    frequency, timestamp (key)
    source SDR (label)
    data
    text
    voice fingerprint?
    voice call sign
    radio call sign
call sign
    person or club (flag)
    person name
    person handle (tactical call)
    voice fingerprint
    radio fingerprints?

Database

(assuming clips are divided up into separate operators / transmissions before being added to the db)

clip
    clipID (uint, simple serial number) <- primary key
    timestamp (timestamp)
    frequency (uint, in kHz?)
    sdrID (tinyint)
    matchedFingerprintID (uint)
    doubleTransmission (bool, true if clip has overlapping transmissions)
    processed (bool)
    text (longtext)
    operator (FOREIGN KEY operatorID)
    radio (FOREIGN KEY radioID)
operators
    operatorID (uint, simple serial number) <- primary key
    callsign (tinytext)
    name (tinytext)
    handle (tinytext)
    firstHeard (FOREIGN KEY clipID)
    fingerprint (FOREIGN KEY fingerprintID)

(to do: add tables for fingerprints, radios, sdrs)
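The tables above translate fairly directly to SQL. A sketch in SQLite (types adapted, since SQLite has no tinyint/longtext; the fingerprint, radio, and SDR tables are still to-do, so those columns are left as plain integers; the inserted rows are illustrative):

```python
import sqlite3

DDL = """
CREATE TABLE operators (
    operatorID  INTEGER PRIMARY KEY,
    callsign    TEXT,
    name        TEXT,
    handle      TEXT,
    firstHeard  INTEGER,  -- FK to clip.clipID once clips exist
    fingerprint INTEGER   -- FK to a future fingerprints table
);
CREATE TABLE clip (
    clipID    INTEGER PRIMARY KEY,
    timestamp TEXT,
    frequency INTEGER,             -- kHz
    sdrID     INTEGER,
    matchedFingerprintID INTEGER,
    doubleTransmission INTEGER,    -- bool: overlapping transmissions
    processed INTEGER,             -- bool
    text      TEXT,
    operator  INTEGER REFERENCES operators(operatorID),
    radio     INTEGER              -- FK to a future radios table
);
"""

db = sqlite3.connect(":memory:")
db.executescript(DDL)
db.execute("INSERT INTO operators (callsign, name) VALUES ('W4ABC', 'Alice')")
db.execute("INSERT INTO clip (timestamp, frequency, operator) "
           "VALUES ('2024-01-01T00:00:00Z', 146940, 1)")
row = db.execute(
    "SELECT o.callsign, c.frequency FROM clip c "
    "JOIN operators o ON c.operator = o.operatorID").fetchone()
print(row)  # → ('W4ABC', 146940)
```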

processed events sent to MQTT

All events include the original frequency and timestamp metadata; some events may be late.

  • event heard, metadata
  • text from voice recognition
  • voice recognized
  • radio recognized
  • call sign recognized

Training and analysis issues

  • bias in data: male / female
  • supervised labeling
  • unsupervised labeling from transcripts / check correlation for errors
  • noise in sample --> issues in transcription
  • labeled and unlabeled samples

External references