In ChatGPT as Vintage Watch Dealer? Part 1 I wrote about the possibility of artificial intelligence tools such as ChatGPT (hereafter CGPT) replacing vintage watch dealers. Although it is too early for AI to replace vintage watch dealers, we seem to be on a trajectory whereby AI can at least assist in the higher value-added tasks such as identification, authentication and pricing of vintage watches. In Part 2 we will see if CGPT can authenticate a complicated watch with non-genuine parts, but I’m still working on that post. Instead, for this post we will see if AI can replace... me!
Vintage Watch Auction Writer
What is a Vintage Watch Auction Writer? It is loosely defined as someone who writes about vintage watches at auction. For example, Eric Wind and Isaac Wingold used to write the “Bring a Loupe” column in Hodinkee, covering both auctions and full price listings. Other writers also regularly review watches in branded auctions such as Christie’s and Phillips. I guess we would be the watch industry’s equivalent of sports writers, reviewing stats (prices, specs), breaking news of and building up excitement for game (auction) day. It involves the following mundane tasks:
1. Reviewing auction catalogues
2. Scrutinizing photos to assess condition, authenticity and originality
3. Estimating market value and evaluating reserve price
4. Researching trends and selecting only lots that would be of interest to readership
5. Writing concise commentary for each lot selected before a weekly deadline
Each writer has their own area of interest and expertise, in addition to being limited to what is on offer by the auction houses. Hence task 4 is quite difficult to do consistently. And as I’m finding out now, it takes a significant amount of time to do tasks 1-3 and 5, as it involves reviewing hundreds of lots a day, exchanges with auction houses, drafting, editing and re-editing commentary to be more concise. Add to this that I’m also deathly afraid of missing a potential bargain auction or recommending a suspicious lot!
ChatGPT to the Rescue?
For the Auction Highlights: October 25, 2023 Bukowski’s post, I reviewed the full catalogue of Bukowski’s Timepieces 651 auction and selected 6 lots I felt were within the Strikezone of this Substack and hence would be of interest to subscribers. This took me quite some time, approximately 4-5 hours in total, including some back and forth with the auction house and cross-checking photos with my own database of previously sold watches.
What if I fed the Bukowski’s auction catalogue to CGPT, and asked it to select 6 lots and write the column for me? The only parameters I gave it were that the low estimate had to be <$25,000 and the production year 1985 or prior. Which ones would it select? Let’s find out.
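As an aside, those two parameters are simple enough to write down as an ordinary filter. The sketch below is only an illustration of what I asked for, not how CGPT works internally. The `Lot` record, its field names and the three sample lots are all made up for the example, and it assumes estimates have already been converted to US dollars.

```python
from dataclasses import dataclass


@dataclass
class Lot:
    # Hypothetical record layout; not Bukowski's actual catalogue format.
    number: int
    title: str
    low_estimate_usd: float  # assumes estimates were converted to USD
    production_year: int


def passes_basic_parameters(lot, max_low_estimate_usd=25_000, latest_year=1985):
    """True if the low estimate is under the cap and the watch dates from 1985 or earlier."""
    return (
        lot.low_estimate_usd < max_low_estimate_usd
        and lot.production_year <= latest_year
    )


# Made-up sample lots, purely for illustration.
lots = [
    Lot(101, "Hypothetical 1960s chronograph", 4_000, 1965),
    Lot(102, "Hypothetical modern diver", 6_000, 2015),
    Lot(103, "Hypothetical 1950s perpetual calendar", 60_000, 1955),
]

eligible = [lot for lot in lots if passes_basic_parameters(lot)]
print([lot.number for lot in eligible])
```

A filter like this only narrows the field. It says nothing about which of the remaining lots are actually interesting, which is the part I was hoping CGPT could help with.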
Without any training and with only 2 parameters, CGPT selected 2 of the 6 lots that I had selected (the 2 Lemanias). I’m actually quite honored. However, we don’t know how and why CGPT selected the lots that it did, and actually, it’s not very important. Why?
Here I want to remind you of the teacher-naive pupil analogy. The best way to interact with CGPT is to consider it your naive Pupil, and you as Teacher should nudge or guide it along to get what you need and learn along the way. Insufficient guidance will give you naive answers like I got above (no disrespect to Longines, UG or Eterna collectors, but they just weren’t the standout lots in the catalogue).
So in order to guide our pupil we need to teach it a set of criteria on how to select lots in auction catalogues. I drafted my own criteria customized for this Substack, fed them to CGPT and asked it to select 6 lots again. (I’m not going to post my criteria here because it’s kind of boring, but feel free to ask.) I also asked CGPT to write a concise description of each lot as if it were a writer for Hodinkee. Which lots would it pick?
Using my criteria as a guide, CGPT chose 5 of the 6 lots I had selected. Instead of the Swedish Army issued Lemania TG195 (Lot 28), CGPT chose the Vacheron & Constantin Ref 4718 (Lot 3). When queried why it didn’t choose the Lemania, its reason was that “it is missing the crown”, which CGPT had read in the catalogue description. I know it also prioritized lots with higher estimates, because I had instructed it to do so in my criteria. The reason I hadn’t included the Vacheron was that there were several watchmaker’s marks on the dial that I felt detracted from its desirability.
CGPT also did an OK job of writing a concise description and reasoning to bid on these lots. I had asked it to synthesize information from Bukowski’s catalogue and information gleaned from online sources. The result could use some polishing, but I think it did OK for a first run with no guidelines on how to write the descriptions, other than using the “Hodinkee-style” of writing. Next time I will feed it all my old Auction Highlights posts and see if it can write in my style.
Am I Done Here?
CGPT did a pretty good job, so can I just set it up to do this automatically every week in my stead? As much as I would love to autopilot this Substack, unfortunately the answer is not yet. In fact, this exercise with CGPT took me LONGER than I normally would spend on an auction highlights post, for the following reasons:
Subpar Vision - CGPT is supposed to be able to “see” pictures, but it hallucinates often and is unable to see many important characteristics in a photo (even if high resolution). Although I fed it all the lot photos and asked it to see and review them, it only used the text information to select lots.
Inaccurate Application of Criteria - My criteria were a pretty simple list, but it still chose lots that obviously didn’t fit them. For example, one criterion was to exclude lots dated after 1985, but it kept selecting one or two modern lots. I had to correct it many times to get to the final list that you see here. When queried why it chose a lot that didn’t adhere to the criteria, the reason was that “it was an oversight” or “it should have been included.” This was pretty frustrating. Structuring the criteria in a step-by-step, chronological decision-making waterfall order did not help either.
Erratic Memory - CGPT seems to remember some conversations and totally forget others. For example, even after I had corrected its selection several times, it sometimes reverted to choosing lots that do not adhere to the criteria.
Hyper-Sensitive to Format of Data - Bukowski’s catalogue is fairly well formatted, but I still had to adjust the catalogue manually in order for CGPT to extract data from it. For example, some lots had 2 pages of details while others had only 1 page. This totally confused CGPT, and I grumbled and spent some time manually adjusting the PDF.
Difficult to Teach Process Knowledge - Lot 28, the Lemania TG195, is missing its crown. From my past process knowledge and relationships with dealers and collectors, I know that this is a fairly easy fix and that it is possible to source a crown. But how would we teach this to CGPT? Even if CGPT were to achieve superhuman intelligence, how would it know that my friend Martin (not his real name) somewhere in Scandinavia has spare crowns in his drawer?
Equal Weight Criteria Confusion - Some of the criteria have equal weights - for example, “prioritize stone dial Rolex Day-Dates” and “prioritize issued military chronographs”. Both are desirable and it is difficult to prioritize one over the other, even for humans. So CGPT the naive pupil gets confused here and needs its Teacher (me) to guide it.
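One way to catch the slips described above (modern lots sneaking back into the list) without re-reading every answer would be to check CGPT's picks against the hard rules in ordinary code. This is a minimal sketch of that idea and not something I used for this post. The catalogue layout, the lot numbers and the `find_violations` helper are all hypothetical.

```python
def find_violations(picked_lot_numbers, catalogue, latest_year=1985):
    """Return {lot_number: reason} for every pick that breaks a hard rule."""
    violations = {}
    for number in picked_lot_numbers:
        lot = catalogue.get(number)
        if lot is None:
            # The model named a lot that is not in the catalogue at all.
            violations[number] = "not in catalogue"
        elif lot["year"] > latest_year:
            violations[number] = f"dated {lot['year']}, after {latest_year}"
    return violations


# Made-up catalogue extract keyed by lot number, purely for illustration.
catalogue = {
    1: {"title": "Hypothetical 1950s chronograph", "year": 1958},
    2: {"title": "Hypothetical modern diver", "year": 2012},
}

picks = [1, 2, 7]  # hypothetical lot numbers returned by the model
violations = find_violations(picks, catalogue)
print(violations)
```

A check like this only covers the black-and-white rules. The equal-weight judgment calls, such as stone dial Day-Date versus issued military chronograph, still need the Teacher.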
Nevertheless, I’m quite optimistic about CGPT writing this Substack based on text information on my behalf in the future. Compared to traditional writing, where you gather your ingredients to build something from the ground up, writing with CGPT is more like sculpting - there is a mass of text data that you sort of shape into place.
CGPT Needs New Glasses? Or We Need a New Paradigm?
When OpenAI announced in September that CGPT was multi-modal, meaning it has the ability to interpret pictures and photos, I was very excited. A picture can tell a thousand words, and high resolution auction pictures can save you thousands of dollars. It is essential for an auction writer to be able to scrutinize photos and identify problems with each lot. Could CGPT do this in my stead?
If you recall, CGPT did not select Lot 28, the Lemania TG195, partly because it is missing the crown. However, CGPT had read that the crown was missing in the text description. It had not seen the missing crown.
Could it be due to lack of training? I’m not going to attach all the screenshots, but I fed CGPT photos of an all original, excellent condition TG195 and asked it to study them and use them as a basis for analyzing all other specimens. I also provided text information about each photo, describing certain parts such as crowns, pushers etc. For example, referring to a case-side photo, I described the crown as a “round knob with knurled edges used to set the time, located in the middle.”
Next, I uploaded high-resolution pictures of the TG195 from the Bukowski’s website and asked it to compare them and advise on the steps necessary to refurbish it to the same condition as the all-original, excellent condition TG195 above:
What I hoped to get as a response was:
Crown is missing
Sweep hand is discolored and should be repainted.
Bonus Point: Crystal is scratched and should be replaced, but the dial is OK
Super Bonus Point: Pusher is discolored, not dented
Instead I got this:
Again I’d like to remind you that CGPT is not an Oracle. As explained in ChatGPT as Vintage Watch Dealer? Part 1, it’s a fuzzy processor that is better suited to guiding you in the right direction than to providing accurate answers. But “Clean and possibly restore the dial” is the worst advice you can give a collector, as it is a surefire way to totally destroy the value of the watch. It also did not realize that the crown is missing. I nudged CGPT to compare the crowns again, unsuccessfully:
So I had to make it sort of obvious:
Finally, CGPT acknowledged the possibility that the crown is missing, but was still uncertain. It’s possible that CGPT was originally confusing the stem tube for the crown, but they are totally different shapes, so it doesn’t reflect well on CGPT’s visual capabilities. I’m not going to attach screenshots, but CGPT was also unable to determine if the scratches were on the crystal or the dial, despite Bukowski’s photos being high resolution.
So is this inability to “see” photos very well due to subpar or insufficient training? Or could it be that it is difficult for CGPT to comprehend a 3-dimensional object vs a 2-dimensional object such as a painting or baseball card? Or perhaps its overall visual abilities are still nascent? Or will LLMs such as CGPT never be able to really see anything clearly at all?
Computer vision experts and data scientists would probably advise that the training photo set should have been better annotated, formatted and organized, which would have taken me several full days to do. Additionally, I should have somehow taught CGPT (or any other AI system) that the 2-dimensional JPEG is actually a photo of a 3-dimensional object. Watches are multi-layered 3D objects, composed of 1. a reflective plastic or sapphire crystal on top of 2. a flat dial with 2D script and protruding 3D h/m/s hands, on top of 3. a mechanical movement housed in 4. a case which is enclosed with 5. a caseback and secured to the wrist with 6. a strap or bracelet. The training would probably entail teaching these basic characteristics first. Such an undertaking may be feasible for several watches but would be very costly and time-consuming to do for the thousands of watches that fit the scope of this Substack. As of this writing it’s also uncertain whether CGPT or any other AI tool would be able to correctly identify condition problems obvious to humans even with that optimized training. Perhaps CGPT’s visual abilities will improve, or maybe LLMs just aren’t cut out for visual comprehension and we need a new paradigm.
So for now you’re basically stuck with me scrutinizing pictures with my sore old eyes, tapping my keyboard to move the blinking cursor along and write my auction highlights posts. But like a naive pupil, I’m optimistically waiting for the day AI can replace me.