The night of the 2022 Australian federal election, a friend of mine shared a link to the media feed provided by the Australian Electoral Commission (AEC). The purpose the feed is to provide a live feed of election results on the night of the election. The intended audience of the feed itself is for media organisations and third parties. They provide historical data to use to generate predictions and offer comparisons to past results for a particular electoral division.

As I was only told about it on election night there was no time to build a system that attempted to mimic the live tally that news organisations produce. The idea that came to mind is produce a video stream that mimics that of the televisions stations without the clips of humans standings in front of screens. I had considered collecting the data for that election as each revision of the data is provided and thus it would be possible to ‘replay’ the tally. Possibly so someone could use it on a Twitch or YouTube Live stream to provide their own election night tally. THere was however no where near enough time to build that.

I went over the data and there were pieces of information that stood out for me that I wanted to focus on. In this case it was the preload data. Essentially data that a media organisation would want to load into their system before the tallying/counting of votes begin. Mostly around who the candidates are for each electoral division.

The two key pieces that stood out were:

  • Contact address - email
  • Profession (Occupation)

I discovered the email addresses for candidates were not included in the data for the 2022 election, which was unfouraranelty as I had been focusing on the the previous election in 2019.

The format for the files are Election Markup Language Version 5 which is an OASIS Standard built on top of Extensible Markup Language (XML). In case you are curious and to save you performing a web search, OASIS stands of Organization for the Advancement of Structured Information Standards.

A high level look of this is as follows:

<EML>
    TransactionId
    Count
        EventIdentifier
            EventName
        Election
            ElectionIdentifier
            Contests
                Contest (0 or more)
</EML>
  • EventIdentifier
    • Attribute: Id
      • Example: 24310
    • EventName
      • Example: 2019 Federal Election
  • ElectionIdentifier
    • Attribute: Id
      • Example: 24310
    • Element: ElectionName
      • Example: House of Representatives Election
    • Element: ElectionCategory
      • Example: House

For example, the event may be the 2019 Federal Election, and the two elections taking place are: • House of Representatives Election • Senate Election

Therefore the candidates are broken down by Election (Senate or House) then by Contest (State or Electoral Division). I haven’t broken down the information into the seperate elections and instead merged the two elections together.

The preload data here is structured as: CandidateList -> Election -> Contest -> Candidate

Sample candidates:

Candidate(identifier='32624', name='CHEHOFF, Michael',
          affiliation=Affiliation(identifier='1330', short_code='AFN', name='Australia First Party'))
Candidate(identifier='33656', name='McLERNON, Peter',
          affiliation=Affiliation(identifier='1469', short_code='UAPP', name='United Australia Party'))
Candidate(identifier='33469', name='CHANG, Tshung-Hui',
          affiliation=Affiliation(identifier='1133', short_code='ON', name="Pauline Hanson's One Nation"))
Candidate(identifier='33388', name='BEAZLEY, Hannah',
          affiliation=Affiliation(identifier='197', short_code='ALP', name='Australian Labor Party'))

The top 20 domains in 2019 for candidates for federal election were:

  • 170 unitedaustraliaparty.org.au
  • 149 gmail.com
  • 118 aph.gov.au
  • 44 vic.alp.org.au
  • 42 vic.greens.org.au
  • 35 onenation.com.au
  • 33 sustainableaustralia.org.au
  • 30 nsw.greens.org.au
  • 27 vic.liberal.org.au
  • 26 nswliberal.org.au
  • 25 bigpond.com
  • 24 queenslandlabor.org
  • 21 onenationhq.com.au
  • 20 hotmail.com
  • 18 walabor.org.au
  • 17 nswlabor.org.au
  • 16 lnpq.org.au
  • 15 conservatives.org.au
  • 14 outlook.com
  • 14 westernaustraliaparty.org.au

A follow-up for this would be to focus on the bigger parties and check the domain against the party name, to see how many Liberal candidates weren’t using a Liberal domain. Returning candidates typically seem to use the apch.gov.au domain for the Parliament of Australia.

The top 20 professions in 2019 for candidates for federal election were:

  • 101 Member of Parliament
  • 63 Retired
  • 38 Self Employed
  • 35 Student
  • 29 Business Owner
  • 22 Senator
  • 21 Teacher
  • 20 Lawyer
  • 19 Manager
  • 16 Unemployed
  • 16 Company Director
  • 15 Engineer
  • 14 Consultant
  • 14 Farmer
  • 10 Electrician
  • 10 Director
  • 10 Solicitor
  • 9 Councillor
  • 9 Project Manager
  • 9 Not Employed

The top 20 professions in 2022 for candidates for federal election were:

  • 85 Member of Parliament
  • 57 Retired
  • 45 Self Employed
  • 35 Student
  • 33 Business Owner
  • 32 Unemployed
  • 28 Consultant
  • 26 Federal Member of Parliament
  • 26 Senator
  • 23 Lawyer
  • 22 Manager
  • 20 Director
  • 20 Teacher
  • 16 Solicitor
  • 14 Councillor
  • 13 Engineer
  • 13 Farmer
  • 12 Small Business Owner
  • 12 Project Manager
  • 11 Accountant

The Python script I wrote for this is available from GitHub.

The other thing I worked on was converting the non-preload data into the parquet format. I am yet to use that for anything.

Closing

A great video that covers how media organisations derives the swing for the election and where their confidence in being able to ‘call’ the election comes from. In this case the term ‘call’ is being used to mean to declare a winner, where as typically before an election, the act of calling an election means to say one is going to happen soon and when.

“Election Night Analysis: Art or Science” - Antony Green (LCA 2022 Online)

I am hoping to revisit the election media feed data in the future. Hopefully to the point where I can make video from the data for tally, even if it doesn’t cover the whole election. Of course for best results would need to also generate the swing and estimation to give it the true television feel.