Getting started
Ensure that both piargus and TauArgus are installed. Then start by importing piargus along with pandas:
import pandas as pd
import piargus as pa
Loading data
There are two primary ways to use piargus: starting from microdata or from table data.
In both cases, your data must be in the form of a pandas DataFrame.
If your data is stored in a CSV file, it can be loaded using pd.read_csv():
input_df = pd.read_csv('data.csv')
For more options to load data, consult the pandas documentation.
Starting from Microdata
First, convert your input_df into a MicroData object:
input_data = pa.MicroData(input_df)
If any columns are hierarchical, specify them. For example, if regio is hierarchical and its hierarchy is stored in a file regio.hrc, you can load the hierarchy as follows:
regio_hierarchy = pa.TreeHierarchy.load_hrc("regio.hrc")
input_data = pa.MicroData(
    input_df,
    hierarchies={"regio": regio_hierarchy},
)
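An .hrc file is a plain-text file with one code per line, in which leading @ characters mark the depth of a code within the hierarchy. A hypothetical regio.hrc could look like this:

ExampleProvince
@ExampleDam
@ExampleCity
OtherProvince
@OtherTown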
Setting up a Table
Set up a table with symbol and regio as explanatory variables and income as the response variable. Use the p%-rule (here with p = 10) as a safety rule and OPTIMAL as the method for secondary suppression:
output_table = pa.Table(explanatory=['symbol', 'regio'],
                        response='income',
                        safety_rule="P(10)",
                        suppression_method=pa.OPTIMAL)
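The p%-rule marks a cell as unsafe when the cell total minus the two largest contributions is less than p% of the largest contribution, since the second-largest contributor could then estimate the largest contribution too accurately. Under P(10), for example, a cell with contributions 100, 20 and 5 is unsafe: 125 - 100 - 20 = 5, which is less than 10% of 100.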
Running the Job
To run the table generation job with TauArgus:
tau = pa.TauArgus(r'<Insert path to argus.exe here>')
job = pa.Job(input_data, [output_table], directory='tau', name="basic-example")
report = tau.run(job)
table_result = output_table.load_result()
print(report)
print(table_result)
table_result.dataframe().to_csv('output/microdata_result.csv')
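Note that to_csv() does not create missing directories, so make sure the output directory exists before writing:

import os

os.makedirs('output', exist_ok=True)  # create the output directory if it does not exist yet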
The printed report and table result will look like this:
<ArgusReport>
status: success <0>
batch_file: tau\basic-example.arb
workdir: tau\work\basic-example
logbook_file: tau\basic-example_logbook.txt
logbook:
25-Aug-2023 16:49:24 : <OPENMICRODATA> "tau\input\basic-example_microdata.csv"
25-Aug-2023 16:49:24 : <OPENMETADATA> "tau\input\basic-example_microdata.rda"
25-Aug-2023 16:49:24 : <SPECIFYTABLE> "symbol""regio"|"income"||
25-Aug-2023 16:49:24 : <SAFETYRULE> P(10, 1)
25-Aug-2023 16:49:24 : <READMICRODATA>
25-Aug-2023 16:49:24 : Start explore file: tau\input\basic-example_microdata.csv
25-Aug-2023 16:49:24 : Start computing tables
25-Aug-2023 16:49:24 : Table: symbol x regio | income has been specified
25-Aug-2023 16:49:24 : Tables have been computed
25-Aug-2023 16:49:24 : Micro data file read; processing time 0 seconds
25-Aug-2023 16:49:24 : Tables from microdata have been read
25-Aug-2023 16:49:24 : <SUPPRESS> OPT(1)
25-Aug-2023 16:49:25 : End of Optimal protection. Time used 0 seconds
Number of suppressions: 4
25-Aug-2023 16:49:25 : <WRITETABLE> (1, 2, AS+, "tau\output\basic-example_table-1.csv")
25-Aug-2023 16:49:25 : Table: symbol x regio | income has been written
Output file name: tau\output\basic-example_table-1.csv
25-Aug-2023 16:49:25 : End of TauArgus run
Response: income
                        safe status  unsafe
symbol regio
Total  Total          264.43      S  264.43
       ExampleDam          x      M  141.57
       ExampleCity         x      M  122.86
A      Total               x      M  142.59
       ExampleDam          x      U   93.13
       ExampleCity         x      U   49.46
C      Total               x      M  121.84
       ExampleDam          x      U   48.44
       ExampleCity         x      U   73.40
Interpreting Status codes
| Status | Meaning |
|---|---|
| S | Safe |
| P | Protected |
| U | Unsafe by primary suppression |
| M | Unsafe by secondary suppression |
| Z | Empty cell |
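These codes appear in the status column of the result dataframe, so they can be used for further post-processing. A minimal sketch, assuming the column names shown in the output above:

df = table_result.dataframe()
# select the cells that were suppressed, either primarily (U) or secondarily (M)
suppressed = df[df["status"].isin(["U", "M"])]
print(suppressed)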
Starting from TableData
To work with tabular data, convert input_df into a TableData object:
input_data = pa.TableData(
    input_df,
    explanatory=["activity", "size"],
    response="value",
    safety_rule="P(10)",
    suppression_method=pa.OPTIMAL,
)
You can also specify additional parameters to TableData:
| Parameter | Meaning | Example |
|---|---|---|
| total_codes | Total code for each explanatory variable. | {"activity": "Total"} |
| frequency | Column with number of contributors to response. | "count" |
| top_contributors | Columns with top contributors. | ["top1", "top2"] |
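As a sketch of how these could be passed (the parameter names above and the column names count, top1 and top2 are illustrative; verify them against the piargus API reference):

input_data = pa.TableData(
    input_df,
    explanatory=["activity", "size"],
    response="value",
    total_codes={"activity": "Total", "size": "Total"},  # assumed: total code per explanatory variable
    frequency="count",  # assumed: column holding the number of contributors
    top_contributors=["top1", "top2"],  # assumed: columns holding the largest contributions
    safety_rule="P(10)",
    suppression_method=pa.OPTIMAL,
)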
To run the data protection job:
job = pa.Job(input_data, directory='tau', name='my-table-data')
result = tau.run(job)
table_result = input_data.load_result()
print(result)
print(table_result)
table_result.dataframe().to_csv('output/tabledata_result.csv')