I took the first few weeks of an ancient philosophy class and found it extremely frustrating. I generally love philosophy, but ancient philosophy was somehow different. Throughout every reading and every lecture, I couldn’t help but think that ancient philosophy was just a narrow-minded subset of data science. This might sound reductive—perhaps altogether absurd—until you think about what ancient philosophers like Socrates were really after.
Socrates and Plato spent a lot of time trying to get at the essence of concepts like virtue, courage, and happiness. Typically, Socrates would bait someone into defining a broad term like “courage”, then would go on to refute the definition with a string of counter-examples. After having done this, he would conclude that they—Socrates and the collocutor—had no clue what “courage” was.
The TA for the class did a similar thing in our discussion section. He posed a simple question: what is a watch? It turned out that nobody knew. Is a pocket watch a watch? What about a smartwatch? What if you strap your phone to your arm—is that a watch? It went on, case after case, and it seemed more and more difficult to nail down a universal definition. In fact, it seemed that “watchness” had more to do with brand marketing than with the function or structure of the product. The newest Fitbit is essentially a smartwatch, yet people don’t call it that. What is it that makes the Moto 360 a watch while the Fitbit is not? None of us could figure it out.
What frustrated me at the time, and probably led me to drop the class, was that philosophers were approaching the problem from the wrong direction. Defining terms is—and always has been—a simple matter of classification. Data scientists have been dealing with this problem for years. In the case of a watch, we want a function that maps an object to a yes-or-no label (”watch” or “no watch”). The function could be anything: a neural network, a decision tree, an SVM, you name it. However, for some reason, philosophers only seemed to want to use one kind of classifier: a sentence.
Essentially, philosophers like Socrates were trying to make classifiers out of language. They wanted a definition that could classify every instance correctly. So, as a data scientist, I am not surprised that Socrates struggled so much. Of course a human knows what a watch is; we are essentially huge neural networks. However, there is absolutely no reason to assume that our neural circuitry could be compressed down to less than a kilobyte (the size of a lengthy paragraph).
With this being said, definitions can be extremely useful. Just like simple regression models, definitions are low variance. Basically, a definition gives you a “good sense” of a categorization without needing tons of data. However, one should never expect a definition to fulfill the duties of a high variance classifier like a deep neural network. Definitions cannot and should not cover every edge case. Capturing the intricacies of meaning is a job for a complex model like the human brain. That, I believe, is what Socrates was missing.