I recently wrote an article about using Awk to make sense out of an Nginx access log. But Awk isn’t limited to just sysadmin things; anytime I need to do bulk processing of text, I usually make the first pass with Awk.
You’re right, some (many?) times awk isn’t the right choice — maybe the project has too many edge-cases to consider, or maybe there are too many unknowns that need to be abstracted away. In those cases, turning to a different language with robust libraries makes sense.
On the other hand, the one thing I don’t like about turning to another language is all the scaffolding that’s required to just get working. Usually, to get started, you create a project, initialize a virtual environment, define and install dependencies, mess with configuration because something is updated, and on and on.
Because awk is likely already installed on the system you’re working on, for many problems, you’re just a couple of minutes away from a solution.
CSV of Nexo Transactions
So that’s where I found myself the other night — I needed to convert a CSV file of cryptocurrency transactions from Nexo into a format that Ledger-cli understands.
More specifically, how do I turn something like this:
Transaction,Type,Currency,Amount,USD Equivalent,Details,Outstanding Loan,Date / Time
NXTnxxxxxxxxx,Interest,ADA,0.06513149,$0.14267261305268,approved / ADA Interest Earned,$0.00,2021-08-16 01:00:04
NXTGxxxxxxxxx,Interest,ADA,0.06887155,$0.15001153355925,approved / ADA Interest Earned,$0.00,2021-08-15 01:00:03
NXTcxxxxxxxxx,Interest,ADA,0.06511621,$0.13219163652648,approved / ADA Interest Earned,$0.00,2021-08-14 01:00:03
NXTaxxxxxxxxx,Exchange,USDC/ADA,-1166.227927 / +571.00203394,$1165.3789130691,approved / Exchange USD Coin to Cardano,$0.00,2021-08-13 15:39:29
NXTDxxxxxxxxx,Deposit,USDC,1166.227927,$1187.6814460844,approved / 0xc000000000000000000000000000000000000000000000000000000000000000,$0.00,2021-08-12 20:48:22
Into this:
2021-08-16 01:00:04 Interest -- approved / ADA Interest Earned
Assets:Current:Nexo:ADA 0.06513149 ADA
Income:Interest:Nexo:ADA -$0.14267261305268
2021-08-15 01:00:03 Interest -- approved / ADA Interest Earned
Assets:Current:Nexo:ADA 0.06887155 ADA
Income:Interest:Nexo:ADA -$0.15001153355925
2021-08-14 01:00:03 Interest -- approved / ADA Interest Earned
Assets:Current:Nexo:ADA 0.06511621 ADA
Income:Interest:Nexo:ADA -$0.13219163652648
2021-08-13 15:39:29 Exchange -- approved / Exchange USD Coin to Cardano
Assets:Current:Nexo:ADA 571.00203394 ADA
Assets:Current:Nexo:USDC -1166.227927 USDC
2021-08-12 20:48:22 Deposit -- approved / 0xc000000000000000000000000000000000000000000000000000000000000000
Assets:Current:Nexo:USDC 1166.227927 USDC
Equity:Transfers -$1187.6814460844
Quick Conversion Using Awk
The actual process of writing the script took a few minutes, the result being this bash script:
#!/bin/bash
awk -F "," '{
if ($2 ~ /Interest/)
{print $8 " " $2 " -- " $6 "\n\tAssets:Current:Nexo:" $3 " " $4 " " $3 "\n\tIncome:Interest:Nexo:" $3 " -" $5; print "\n";}
if ($2 ~ /Deposit/)
{print $8 " " $2 " -- " $6 "\n\tAssets:Current:Nexo:" $3 " " $4 " " $3 "\n\tEquity:Transfers -" $5; print "\n";}
if ($2 ~ /Exchange/)
{
split($3,a,"/"); # split into the two currencies
split($4,b,"/"); # split into the two amounts
gsub(/\+/,"",b[2])
print $8 " " $2 " -- " $6 "\n\tAssets:Current:Nexo:" a[2] " " b[2] " " a[2] "\n\tAssets:Current:Nexo:" a[1] " " b[1] " " a[1]; print "\n";}
}' nexo_transactions.csv > nexo.ada.ledger
The few things I’ll call out since it wasn’t completely obvious when I started:
- By default awk’s field separator is a space. Since this is a CSV file, we want to use a comma instead. This is the
-F ","
part of the above script. - Because we want slightly different behavior depending on the type of transaction, we need to use
if
statements that test if a the second (ie the “type”) field is the one we want. - The various “exchange” transactions had additional info packed into the
Currency
andAmount
fields. Basically there were two values that we needed to extract from the one field. Because of this, we use thesplit(<field>, <local-variable>, <delimiter>)
function. - Lastly, I didn’t know until working through this exercise, but Ledger-cli doesn’t like to have a plus preceeding positive amounts. Because Nexo is explicit in adding the plus, I added the
gsub(<search-regex>, <replacement>, <string>)
function to replace the plus in the “split” amount field.