[00:00:00.000 --> 00:00:03.680] Good data analysis relies on good data classification.
[00:00:03.680 --> 00:00:07.840] But how do you trust your AI to apply classification rules reliably?
[00:00:07.840 --> 00:00:10.880] Today we're talking about attention heads,
[00:00:10.880 --> 00:00:14.240] the AI components that decide which words to focus on.
[00:00:14.240 --> 00:00:16.640] They learn your rules literally.
[00:00:16.640 --> 00:00:20.320] Incomplete rules lead to misclassified results.
[00:00:20.320 --> 00:00:24.880] But first, if you want AI that understands context,
[00:00:24.880 --> 00:00:28.640] visit ask-y.ai and try Prism.
[00:00:29.760 --> 00:00:33.200] January 1st. New year, new me.
[00:00:33.200 --> 00:00:36.320] Time to take stock of last year's spending.
[00:00:36.320 --> 00:00:41.200] Help me classify last year's expenses into personal, work, or health.
[00:00:41.200 --> 00:00:43.200] Here are some examples.
[00:00:43.200 --> 00:00:48.560] Now classify all of last year's expenses.
[00:00:48.560 --> 00:00:56.000] Annual physical. Health.
[00:00:58.240 --> 00:01:00.080] Perfect. Hairdresser.
[00:01:00.080 --> 00:01:03.280] Coffee with client.
[00:01:03.280 --> 00:01:11.280] Personal. Wait, it said client. That's work.
[00:01:11.280 --> 00:01:14.080] What happened?
[00:01:14.080 --> 00:01:17.680] The floofies wear glasses with many lenses.
[00:01:17.680 --> 00:01:19.680] These are called attention heads.
[00:01:19.680 --> 00:01:22.640] Some lenses track word position.
[00:01:22.640 --> 00:01:24.320] Others track word relationships.
[00:01:25.040 --> 00:01:31.200] When classifying coffee with client, a position-focused lens spotted coffee
[00:01:31.200 --> 00:01:35.200] as the anchor word, the first word in the phrase, and gave it extra weight.
[00:01:35.200 --> 00:01:38.960] A relationship-focused lens matched coffee to Marcus's example.
[00:01:38.960 --> 00:01:42.400] Coffee. Personal. Strong match.
[00:01:42.400 --> 00:01:47.680] Other lenses noticed client and found dinner with client equals work.
[00:01:47.680 --> 00:01:50.560] But client wasn't an anchor position.
[00:01:50.560 --> 00:01:54.320] The stronger pattern won. Confident but wrong.
[00:01:54.320 --> 00:01:57.760] When attention heads compete, position often wins.
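The competition described above can be pictured with a toy scoring sketch. This is only an illustration, not how a real transformer computes attention: actual attention heads are learned weights, not hand-written rules. The names `EXAMPLES`, `ANCHOR_BONUS`, `STOPWORDS`, `score_categories`, and `classify` are hypothetical, and the weight of 2.0 for the first word is an assumed value chosen to mirror the "anchor word" idea.

```python
# Toy illustration only: real attention heads are learned, not hand-written.
# The two few-shot examples mentioned in the transcript.
EXAMPLES = {"coffee": "personal", "dinner with client": "work"}

ANCHOR_BONUS = 2.0    # assumed extra weight for the first word (the "anchor")
STOPWORDS = {"with"}  # filler words that carry no signal in this toy model


def score_categories(phrase):
    """Add up a score per category from words that also appear in an example."""
    scores = {}
    for position, word in enumerate(phrase.lower().split()):
        if word in STOPWORDS:
            continue
        weight = ANCHOR_BONUS if position == 0 else 1.0
        for example, category in EXAMPLES.items():
            if word in example.split():
                scores[category] = scores.get(category, 0.0) + weight
    return scores


def classify(phrase):
    """Return the category with the highest score, or 'unknown' if nothing matched."""
    scores = score_categories(phrase)
    return max(scores, key=scores.get) if scores else "unknown"


print(score_categories("coffee with client"))
print(classify("coffee with client"))
```

In this sketch, coffee sits in the anchor position and collects the bonus, so personal outscores work even though client is present. That is the same confident but wrong outcome.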
[00:01:57.760 --> 00:02:00.880] You can't anticipate every edge case with examples alone.
[00:02:00.880 --> 00:02:02.240] Here's what to do instead.
[00:02:02.240 --> 00:02:03.920] Show contrasting examples.
[00:02:03.920 --> 00:02:06.640] Coffee. Personal. Coffee with client. Work.
[00:02:06.640 --> 00:02:10.240] Add rules. If a transaction mentions client, always classify as work.
[00:02:10.240 --> 00:02:11.600] Define exclusions.
[00:02:11.600 --> 00:02:14.640] The word gym only triggers health when it's an activity.
[00:02:14.640 --> 00:02:18.320] Gym membership equals health. Gym bag equals personal.
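The three fixes above (contrasting examples, rules, exclusions) can be assembled into a single prompt. The following is a minimal sketch, assuming the prompt is sent to a model as plain text. The function name `build_prompt`, the section headings, and the sample expense "Lunch with client" are hypothetical; the examples, rule, and exclusion are the ones from the transcript.

```python
def build_prompt(expense, examples, rules, exclusions):
    """Assemble a classification prompt from examples, rules, and exclusions."""
    lines = ["Classify the expense as personal, work, or health.", "", "Examples:"]
    lines += [f"- {text} -> {label}" for text, label in examples]
    lines += ["", "Rules (these override the examples):"]
    lines += [f"- {rule}" for rule in rules]
    lines += ["", "Exclusions:"]
    lines += [f"- {item}" for item in exclusions]
    lines += ["", f"Expense: {expense}", "Category:"]
    return "\n".join(lines)


prompt = build_prompt(
    "Lunch with client",  # hypothetical expense to classify
    examples=[
        ("Coffee", "personal"),
        ("Coffee with client", "work"),  # contrasting pair
        ("Gym membership", "health"),
        ("Gym bag", "personal"),         # contrasting pair
    ],
    rules=["If a transaction mentions client, always classify as work."],
    exclusions=["The word gym only triggers health when it's an activity."],
)
print(prompt)
```

Placing each contrasting pair side by side makes the deciding word (client, bag) stand out instead of leaving the model to infer it from position.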
[00:02:18.320 --> 00:02:23.040] Without priority rules, attention heads compete and the strongest pattern wins,
[00:02:23.600 --> 00:02:26.080] often based on word position, not meaning.
[00:02:26.080 --> 00:02:31.360] Explicit rules tell the lenses which words should override everything else.
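One way to picture "override everything else" is a priority-ordered rule list checked before any pattern matching. This is a sketch under stated assumptions, not a description of how Prism or any model works internally. `PRIORITY_RULES` and `apply_rules` are hypothetical names, and the matching is a simple substring check.

```python
# Hypothetical priority-ordered rules: the first matching rule wins.
# More specific phrases ("gym bag") must come before broader ones ("gym").
PRIORITY_RULES = [
    ("client", "work"),       # a client mention always means work
    ("gym bag", "personal"),  # exclusion: a gym item is not an activity
    ("gym", "health"),        # any other gym mention counts as an activity
]


def apply_rules(expense, fallback="unclassified"):
    """Return the category of the first rule whose keyword appears in the expense."""
    text = expense.lower()
    for keyword, category in PRIORITY_RULES:
        if keyword in text:
            return category
    return fallback


for item in ["Coffee with client", "Gym membership", "Gym bag", "Hairdresser"]:
    print(item, "->", apply_rules(item))
```

Because the list is ordered, word position inside the expense no longer matters: client wins wherever it appears, and anything that matches no rule falls through to the fallback.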
[00:02:31.920 --> 00:02:35.680] Want AI that handles classification nuance automatically?
[00:02:35.680 --> 00:02:39.760] Visit ask-y.ai and try Prism.
[00:02:39.760 --> 00:02:43.840] So your coffee with client always lands in the right category.
[00:02:43.840 --> 00:02:47.760] Happy New Year, AI analysts!