Using Awk to Convert CSV Crypto Transactions into Ledger-cli entries

I recently wrote an article about using Awk to make sense out of an Nginx access log. But Awk isn’t limited to just sysadmin things; anytime I need to do bulk processing of text, I usually make the first pass with Awk.

You’re right, some (many?) times awk isn’t the right choice — maybe the project has too many edge-cases to consider, or maybe there are too many unknowns that need to be abstracted away. In those cases, turning to a different language with robust libraries makes sense.

On the other hand, the one thing I don’t like about turning to another language is all the scaffolding that’s required to just get working. Usually, to get started, you create a project, initialize a virtual environment, define and install dependencies, mess with configuration because something is updated, and on and on.

Because awk is likely already installed on the system you’re working on, for many problems, you’re just a couple of minutes away from a solution.

CSV of Nexo Transactions

So that’s where I found myself the other night — I needed to convert a CSV file of cryptocurrency transactions from Nexo into a format that Ledger-cli understands.

More specifically, how do I turn something like this:

Transaction,Type,Currency,Amount,USD Equivalent,Details,Outstanding Loan,Date / Time
NXTnxxxxxxxxx,Interest,ADA,0.06513149,$0.14267261305268,approved / ADA Interest Earned,$0.00,2021-08-16 01:00:04
NXTGxxxxxxxxx,Interest,ADA,0.06887155,$0.15001153355925,approved / ADA Interest Earned,$0.00,2021-08-15 01:00:03
NXTcxxxxxxxxx,Interest,ADA,0.06511621,$0.13219163652648,approved / ADA Interest Earned,$0.00,2021-08-14 01:00:03
NXTaxxxxxxxxx,Exchange,USDC/ADA,-1166.227927 / +571.00203394,$1165.3789130691,approved / Exchange USD Coin to Cardano,$0.00,2021-08-13 15:39:29
NXTDxxxxxxxxx,Deposit,USDC,1166.227927,$1187.6814460844,approved / 0xc000000000000000000000000000000000000000000000000000000000000000,$0.00,2021-08-12 20:48:22

Into this:

2021-08-16 01:00:04 Interest -- approved / ADA Interest Earned
	Assets:Current:Nexo:ADA    0.06513149 ADA
	Income:Interest:Nexo:ADA    -$0.14267261305268

2021-08-15 01:00:03 Interest -- approved / ADA Interest Earned
	Assets:Current:Nexo:ADA    0.06887155 ADA
	Income:Interest:Nexo:ADA    -$0.15001153355925

2021-08-14 01:00:03 Interest -- approved / ADA Interest Earned
	Assets:Current:Nexo:ADA    0.06511621 ADA
	Income:Interest:Nexo:ADA    -$0.13219163652648

2021-08-13 15:39:29 Exchange -- approved / Exchange USD Coin to Cardano
	Assets:Current:Nexo:ADA     571.00203394 ADA
	Assets:Current:Nexo:USDC    -1166.227927  USDC

2021-08-12 20:48:22 Deposit -- approved / 0xc000000000000000000000000000000000000000000000000000000000000000
	Assets:Current:Nexo:USDC    1166.227927 USDC
	Equity:Transfers    -$1187.6814460844

Quick Conversion Using Awk

The actual process of writing the script took a few minutes, the result being this bash script:

awk -F "," '{
if ($2 ~ /Interest/)
   {print $8 " " $2 " -- " $6 "\n\tAssets:Current:Nexo:" $3 "    " $4 " " $3 "\n\tIncome:Interest:Nexo:" $3 "    -" $5; print "\n";}
if ($2 ~ /Deposit/)
   {print $8 " " $2 " -- " $6 "\n\tAssets:Current:Nexo:" $3 "    " $4 " " $3 "\n\tEquity:Transfers    -" $5; print "\n";}
if ($2 ~ /Exchange/)
   split($3,a,"/");  # split into the two currencies
   split($4,b,"/");  # split into the two amounts
   print $8 " " $2 " -- " $6 "\n\tAssets:Current:Nexo:" a[2] "    " b[2] " " a[2] "\n\tAssets:Current:Nexo:" a[1] "    " b[1] " " a[1];     print "\n";}
}' nexo_transactions.csv > nexo.ada.ledger

The few things I’ll call out since it wasn’t completely obvious when I started:

  • By default awk’s field separator is a space. Since this is a CSV file, we want to use a comma instead. This is the -F "," part of the above script.
  • Because we want slightly different behavior depending on the type of transaction, we need to use if statements that test if a the second (ie the “type”) field is the one we want.
  • The various “exchange” transactions had additional info packed into the Currency and Amount fields. Basically there were two values that we needed to extract from the one field. Because of this, we use the split(<field>, <local-variable>, <delimiter>) function.
  • Lastly, I didn’t know until working through this exercise, but Ledger-cli doesn’t like to have a plus preceeding positive amounts. Because Nexo is explicit in adding the plus, I added the gsub(<search-regex>, <replacement>, <string>) function to replace the plus in the “split” amount field.