
# Multipartite Joining

###### Description

Multipartite Joining is a form of graph projection with attribute aggregation. One partite (or semi-partite) set of nodes, selected by a type attribute, is projected across another (semi-) partite set of nodes.


The new graph contains only nodes from the first set and none of the edges originally in the graph. An edge is created between two nodes in the new graph if they shared at least one neighbor in the second set. New attributes are created on the nodes and edges of the new graph according to rules for aggregating the attributes of neighbor nodes (shared neighbor nodes, in the case of attributes created on edges).


For instance, imagine a graph of four nodes. Nodes A and B are in the first set, and nodes C and D are in the second. Node C is connected to both A and B, and node D is connected only to A. Perhaps nodes A and B are people, and nodes C and D are grants. Each grant is for \$10,000.


We can do multipartite joining, selecting those sets (via some type attribute), with rules as follows:

```
grant_money_received = grant_amount.total
edge.grant_money_shared = grant_amount.total
```

The result would be a new graph with just nodes A and B. There would be an edge between A and B. Node A, since it has neighbors with total grant_amount \$20,000, would have a grant_money_received attribute valued \$20,000. Node B, since it only has neighbors with total grant_amount \$10,000, would have a grant_money_received attribute valued \$10,000. Since they share one \$10,000 grant, the edge between them would have a grant_money_shared attribute valued \$10,000.
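The four-node example can be reproduced with a short Python sketch. This is only an illustration of the projection idea under stated assumptions, not the algorithm's actual implementation: the dictionaries, the `multipartite_join` function, and the type labels `"person"` and `"grant"` are hypothetical names chosen for this example, and only a totalling rule is shown.

```python
from itertools import combinations

# Hypothetical in-memory form of the example graph; not the tool's input format.
node_type = {"A": "person", "B": "person", "C": "grant", "D": "grant"}
grant_amount = {"C": 10_000, "D": 10_000}
edges = [("A", "C"), ("B", "C"), ("A", "D")]


def multipartite_join(node_type, edges, keep_type, join_type, values):
    """Project keep_type nodes across join_type nodes, totalling `values`."""
    # Collect, for each kept node, its neighbors in the joining set.
    neighbors = {n: set() for n, t in node_type.items() if t == keep_type}
    for u, v in edges:
        for kept, other in ((u, v), (v, u)):
            if node_type[kept] == keep_type and node_type[other] == join_type:
                neighbors[kept].add(other)

    # Node attribute: total over all neighbors in the joining set.
    node_attr = {n: sum(values[m] for m in nbrs) for n, nbrs in neighbors.items()}

    # Edge attribute: total over shared neighbors; no shared neighbor, no edge.
    edge_attr = {}
    for a, b in combinations(sorted(neighbors), 2):
        shared = neighbors[a] & neighbors[b]
        if shared:
            edge_attr[(a, b)] = sum(values[m] for m in shared)
    return node_attr, edge_attr


grant_money_received, grant_money_shared = multipartite_join(
    node_type, edges, "person", "grant", grant_amount
)
print(grant_money_received)  # {'A': 20000, 'B': 10000}
print(grant_money_shared)    # {('A', 'B'): 10000}
```

The original edges to C and D do not appear in the result; they are used only to decide which of the kept nodes become connected and what values are aggregated.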


The available operations are sum, average, max, min, mode, and count.
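To make the listed operations concrete, here is a minimal Python sketch that applies each of them to one list of neighbor values. The `OPERATIONS` mapping and the sample values are assumptions for illustration; in particular, how the real algorithm breaks ties for mode or handles missing values is not specified here.

```python
import statistics

# Assumed mapping from the operation names in the text to Python functions.
OPERATIONS = {
    "sum": sum,
    "average": statistics.mean,
    "max": max,
    "min": min,
    "mode": statistics.mode,
    "count": len,
}

# Hypothetical attribute values collected from three neighbor nodes.
values = [10_000, 10_000, 25_000]
results = {name: op(values) for name, op in OPERATIONS.items()}
for name, result in results.items():
    print(name, result)
```

Which operation is sensible depends on the attribute: totalling grant amounts is meaningful, whereas totalling something like a year would not be.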

###### Pros & Cons

This algorithm is very flexible, but slow. It requires a solid understanding of the underlying data to make sensible choices of aggregation functions.


Much of the slowness comes from the flexibility in aggregation functions, which makes it difficult to use fast matrix operations while remaining memory efficient.
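As a rough illustration of the trade-off, a totalling rule on its own can be written as a single matrix product, whereas an operation such as mode has no comparable linear form. The sketch below is an assumption-laden illustration in plain Python, not a description of how the algorithm is implemented; the matrix `B`, the weight list `w`, and the result `M` are hypothetical names, and the data is the four-node grant example.

```python
# Rows are people (A, B); columns are grants (C, D). 1 means "connected".
B = [
    [1, 1],  # A is connected to C and D
    [1, 0],  # B is connected to C only
]
w = [10_000, 10_000]  # grant_amount for C and D

n = len(B)
# M[i][j] = sum over k of B[i][k] * w[k] * B[j][k], i.e. B * diag(w) * B^T
M = [
    [sum(B[i][k] * w[k] * B[j][k] for k in range(len(w))) for j in range(n)]
    for i in range(n)
]
print(M)  # [[20000, 10000], [10000, 10000]]
```

The diagonal holds the per-node totals (20000 for A, 10000 for B) and the off-diagonal entry holds the shared total on the A-B edge. Supporting arbitrary aggregation functions means keeping the individual neighbor values around rather than collapsing them into one product like this.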

###### Implementation Details

The aggregation rules must be specified, as above, in a metadata file provided to the algorithm. This is an ordinary Java properties file with the appropriate keys and values, so it might be called something like "metadata.properties".
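The structure of such a file can be sketched by parsing the two example rules in Python. This is a hedged illustration only: the `Rule` tuple and `parse_rules` function are hypothetical, and the interpretation of each line (an `edge.` prefix marks an edge rule, and the value has the form `<source attribute>.<operation>`) is inferred from the example rules rather than taken from a specification. The parser handles only simple `key = value` lines, not the full Java properties syntax.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    target: str       # "node" or "edge"
    new_attr: str     # attribute created in the projected graph
    source_attr: str  # attribute read from neighbor nodes
    operation: str    # aggregation name, e.g. "total"


def parse_rules(text):
    """Parse simple `key = value` lines; skips blank lines and # or ! comments."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Assumption: an "edge." prefix marks an edge rule; anything else is a node rule.
        if key.startswith("edge."):
            target, new_attr = "edge", key[len("edge."):]
        else:
            target, new_attr = "node", key
        # Assumption: the value is "<source attribute>.<operation>".
        source_attr, _, operation = value.rpartition(".")
        rules.append(Rule(target, new_attr, source_attr, operation))
    return rules


metadata = """
grant_money_received = grant_amount.total
edge.grant_money_shared = grant_amount.total
"""
rules = parse_rules(metadata)
for rule in rules:
    print(rule)
```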
