In our talk presented here we listed several problems involved in scaling democracy, one of them was the problem of having voters decide from many possibilities, which we referred to with

I can’t decide from too many options

Information overload “the difficulty a person can have understanding an issue and making decisions that can be caused by the presence of too much information.” (Toffler)

Here we will present an idea towards fixing this problem. Briefly, the aim is to combine ideas from collaborative filtering with liquid democracy. Wikipedia on collaborative filtering:

The growth of the Internet has made it much more difficult to effectively extract useful information from all the available online information. The overwhelming amount of data necessitates mechanisms for efficient information filtering. One of the techniques used for dealing with this problem is called collaborative filtering.

This description fits very well with our problem. If you haven’t heard of collaborative filtering before, think amazon/lastfm/netflix recommendations, or digg filtering of stories. Here’s a quote from Anton Kast, lead scientist at Digg

The idea of collaborative filtering is simply to combine the input from many different people to filter information better than would otherwise be possible. In particular, you use information from many independent judgements by many people… On Digg anyone can submit a story. And anyone can vote on any story – that’s the filtering part, and whatever is most popular wins. It’s a giant collaborative filter in the simplest, classic sense

With that out of the way, how do we combine collaborative filtering with liquid democracy? The answer is simple, whereas in classic collaborative filtering user affinities are implicit in user behaviour and are inferred statistically by CF systems, liquid filtering makes user affinities explicit, in the form of delegation. After all, **‘user affinity’ is nothing but ‘delegation’ in different clothes.** And as in liquid democracy, transitivity can be mixed in; the interpretation is straightforward: If I trust A’s judgement who trusts B’s judgement, then I trust B’s judgement.

So, let’s get back to the probem we started with. We have a situation where we have to collectively choose among many options. In the presentation this was referred to as *scaling the solution space*. It is infeasible to apply voting as is, because voters cannot consider all these options to make a judgement. So what we do is to **distribute the cognitive load in a way that reflects user delegation**. The problem of liquid filtering is the assignment of voters to questions according to delegation choices, in an optimal way.

### Where’s the math?

Here’s a simple formalization. Note that things can be made more sophisticated to reach specific criterions of optimality. For example, one objective could be to reduce transitive chains lengths such that user delegates are in fact aligned with the user’s judgement. But in this post we will restrict optimality to mean:

minimize the number of questions that each voter has to answer while ensuring that every voter votes on each question

Some definitions follow. A set of questions to be made

**Q** = { q_{1}, q_{2}, q_{3} … }

A set of voters to make a decision

**V** = { v_{1}, v_{2}, v_{3} … }

Voter delegation choices define a binary relationship over **V**

**D** = { (v, v’) | v,v’ ∈ **V** }

The transitive closure of **D** is defined as **D ^{+}**. It encodes transitive delegation information, we will use it shortly. Now, what we want is to produce an assignment of voters to questions. Assignment is a binary relationship over

**V**and

**Q**

**A** = { (v, q) | v ∈ **V, **q** ∈ Q** }

However, assignment as is does not consider delegation. So we define a transitive assigment set **T** according to the following set comprehension

**T** = { (v, q) | ** **∃ (v’, q) ∈ **A** ∧ ∃ (v, v’) ∈ **D ^{+}** ∧ v, v’ ∈

**V**, q

**∈**

**Q**}

**T** is the meat of liquid filtering. It encodes the fact that although voters may not answer a question directly, they can answer it by virtue of having one of their delegates answer it for them. In this way T is the liquid filtering equivalent of a liquid democracy algorithm that walks the nodes to determine a user’s vote when that user does not vote directly.

We can finally state the aim of liquid filtering in terms of our definitions. Recall:

minimize the number of questions that each voter has to answer while ensuring that every voter votes on each question

which we can decompose into

minimize the number of questions that each voter has to answer

minimize |**A**|

and

while ensuring that every voter votes on each question

∃ (v, q) ∈ **T** ** **∀ q** **∈** Q**, v ∈** V**

In short

minimize |**A**| subject to ∃ (v, q) ∈ **T** ** **∀ q** **∈** Q**, v ∈** V**

How does this address the initial problem? The answer is that by virtue of delegation (**T**), we have:

|**A**| ≤ |**V**| x |**Q**|

The number of questions necessary to make a decision is less than that if all voters had to decide one every single question (hence this slide). With no delegation, the above equation reduces to equality. The degree to which the reduction of questions occurs is hard to foresee. It depends on how many delegation choices are made, and how those choices are made. Different delegation graphs can lead to different results. Then there’s the matter of implementation.

**Where’s the code?**

What remains is to write an algorithm that obtains **A** from the inputs **V**, **Q** and **D**. This algorithm should be optimal in that it minimizes |**A**|, but other optimality constraints can be added. One is delegation chain length[1]. Another is a threshold such that even if the sum total of voters is minimzed, one does not want to overload specific voters with too many questions. There are many possibilites.

Two other considerations to note.

First, in the above we have assumed that all voters must have a choice for all questions. If we relax this constraint, and use statistics to infer probable choices, we are talking about sampling[2], as in a poll. But even if we are doing sampling, delegation information can still be used to improve the quality of the results, or equivalently reduce the number of questions necessary for a given quality.

Second, it could be possible to write an algorithm that adaptively calculates assignments of voters to questions using results as they become available. This would probably use statistical inference to determine which questions can lead to a final result more quickly, something along the lines of bayesian experimental design comes to mind.

[1] We could also define a distance matrix over V as specified by D. This would encode the distance between voters in terms of the delegation jumps. The distance matrix is a useful if we are interested in limting delegation chains as described earlier

[2] See appgree