Creating Co-occurrence Data
There are several ways to create co-occurrence data for chord diagrams.
From DataFrames
The most common approach is to use a DataFrame where each row represents an observation:
df = DataFrame(
Group1 = ["A", "A", "B", "B", "C"],
Group2 = ["X", "Y", "X", "Y", "X"],
Group3 = ["1", "1", "2", "2", "1"]
)
cooc = cooccurrence_matrix(df, [:Group1, :Group2, :Group3])The function automatically:
- Extracts unique labels from each column
- Groups labels by their source column
- Counts co-occurrences between labels from different groups
- Creates a symmetric co-occurrence matrix
From Raw Matrices
For more control, create a CoOccurrenceMatrix directly:
matrix = [10 5 2; 5 8 3; 2 3 6]
labels = ["A", "B", "C"]
groups = [
GroupInfo{String}(:Group1, ["A", "B"], 1:2),
GroupInfo{String}(:Group2, ["C"], 3:3)
]
cooc = CoOccurrenceMatrix(matrix, labels, groups)Handling Missing Values
Missing values are automatically skipped:
df = DataFrame(
A = ["a1", missing, "a2"],
B = ["b1", "b1", missing]
)
cooc = cooccurrence_matrix(df, [:A, :B]) # Handles missing gracefullyNormalization
Convert counts to frequencies:
cooc = cooccurrence_matrix(df, [:V, :D, :J]; normalize=true)
# or
cooc_norm = normalize(cooc)