On Euler’s Phi Function

In which we find that Euler’s phi function was neither phi nor a function.

First of all, a shout-out to all of my math(s) friends who are at (or traveling to) the Joint Mathematics Meetings in Baltimore! Now on to some math.

In my research for the “Evolution of…” series of posts, I came across the word totient in Steven Schwartzman’s The Words of Mathematics, which got me thinking about how Euler’s φ (phi) function, also called the “totient function,” came about. The word itself isn’t that mysterious: totient comes from the Latin word tot, meaning “so many.” In a way, it’s the answer to the question Quot? (“how many?”). Schwartzman notes that the Quo/To pairing is similar to the Wh/Th pairing in English (Where? There. What? That. When? Then.). So much for the etymology.

It seems to me, though, that the more interesting questions are: who first defined it? how did the notation change over time? I did some digging, and here’s what I’ve discovered.

The first stop on my investigative tour was Leonard Dickson’s History of the Theory of Numbers (1952). At the beginning of Chapter V, titled “Euler’s Function, Generalizations; Farey Series”, Dickson has two things to say about Leonhard Euler:

“L. Euler… investigated the number φ(n) of positive integers which are relatively prime to n, without then using a functional notation for φ(n).”

“Euler later used πN to denote φ(N)…”

Each of these quotations contains a footnote, the first one to Euler’s paper “Demonstration of a new method in the theory of arithmetic” (written in 1758) and the second to “Speculations about certain outstanding properties of numbers” (written in 1775). In the first paper, Euler is more interested in proving Fermat’s little theorem, which, true to form, he had already proven twice before. However, Euler does define the phi function (on p. 76, though as Dickson says, he doesn’t use function notation), and proves some basic facts about it, including the facts that φ(p^m) = p^(m-1)(p-1) [Theorem 3] and φ(AB) = φ(A)φ(B) when A and B are relatively prime [Theorem 5]. This paper is in Latin, and while we do see the use of the words totidem and tot, they don’t seem to hold any special mathematical significance.
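If you'd like to see Theorems 3 and 5 in action, here is a minimal Python sketch. It counts totients straight from the definition (slow, but transparent) and spot-checks both facts on a few small cases. The function name `phi` and the test values are my own choices for illustration, not anything from Euler or Dickson.

```python
from math import gcd


def phi(n: int) -> int:
    """Count the integers k in 1..n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


# Theorem 3: phi(p^m) = p^(m-1) * (p - 1) for a prime p.
for p, m in [(2, 5), (3, 4), (7, 2)]:
    assert phi(p**m) == p ** (m - 1) * (p - 1)

# Theorem 5: phi(A * B) = phi(A) * phi(B) when A and B are relatively prime.
for a, b in [(4, 9), (5, 12), (8, 15)]:
    assert gcd(a, b) == 1
    assert phi(a * b) == phi(a) * phi(b)

# The first ten totients.
print([phi(n) for n in range(1, 11)])  # [1, 1, 2, 2, 4, 2, 6, 4, 6, 4]
```

A spot-check is not a proof, of course, but it does make the two identities feel less abstract.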

In the second paper, Euler returns to the phi function, having decided by this time to use π to represent it. Hard-core nerd that he is, Euler provides us with a table of values of πD for D up to 100, and replicates many of the facts he proved in the first paper. It’s interesting to note that, while Euler wrote this second paper in 1775, it wasn’t published until 1784, a year after his death.

It wasn’t until 1801, in Disquisitiones Arithmeticae, that Carl Gauss introduced φN to indicate the value of the totient of N. So why did he pick φ rather than Euler’s π? Well, I checked the English translation by Arthur Clarke (no, not that Arthur Clarke), and I think it’s quite likely that he chose it for no discernible reason. In Clarke’s translation, Gauss introduces φ on page 20, and Gauss loved using Greek letters. In pages 5-19 (the beginning of Section II), he uses α, β, γ, κ, λ, μ, π, δ, ε, ξ, ν, ζ, and only after these does he use φ. As to the use of π, which was Euler’s notation, it’s possible that Gauss knew of Euler’s later work and chose φ because he had already used π, but there’s no way to know for sure. (Also, π was already used for 3.14159… by this point, but if that was his reasoning, it’s odd that he used the symbol π at all.) Most likely, he just picked another Greek letter off the top of his head. It is important to remember that at no point did Gauss use function notation for the totient: it always appears as φN, never φ(N). (Also: Gauss goes on to use Γ and τ before getting tired of Greek and moving on to the fraktur letters 𝔄, 𝔅, and 𝖅.)

The next significant change came nearly a century later in J. J. Sylvester’s article “On Certain Ternary Cubic-Form Equations,” published in the American Journal of Mathematics in 1879. On page 361, Sylvester examines the specific case n = p^i, and says

p^(i-1)(p-1) is what is commonly designated as the φ function of p^i, the number of numbers less than p^i and prime to it (the so-called φ function of any number I shall here and hereafter designate as its τ function and call its Totient).

While Sylvester’s usage of the word totient has become commonplace, mathematicians continue to use φ instead of τ. It just goes to show that a symbol can become entrenched in the mathematical community, even if a notational change would make more sense. Also of note is the fact that while Sylvester refers to the totient as a function, he doesn’t use the modern parenthesis notation, as in τ(n), but continues in Euler and Gauss’s footsteps by using τn.

And this is where our story ends. Sylvester’s use of the word totient, Gauss’s use of the letter φ, and Euler’s original definition all contributed to the modern construct that we call the phi/totient function. Even though Euler’s original definition came in a Latin paper, it wasn’t until Sylvester that the use of totient became commonplace.

However, Euler had proven many of the basic facts about it as early as 1758. So, while the original phi function was neither phi nor a function, it was undoubtedly Euler’s.


The Evolution of Weights and Measures

At long last, I’ve exhausted my curiosity in mathematical etymologies. Many word histories have been explored in the previous three installments.

This time around, I want to look at some of the words we use for measurements. There are a few interesting histories in the metric system (SI), but most of the fun comes from the English Imperial system.

The Roman Empire provided us with the primary pre-SI system of measurement in Europe, from which many of the medieval systems were derived. The Latin word mille gives us two important words today: million (related to “thousand”, as detailed in a previous post), and mile. As Roman legions marched across the Mediterranean world, they measured their distances according to paces, with a thousand paces being milia passuum. A pace is the distance traveled in two full steps, and is about 58-62 inches (depending, obviously, on an individual’s height). Using this reckoning, the Roman definition of a mile clocks in at 4,833-5,167 feet.
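For anyone who wants to check that range, here is a quick Python sketch of the arithmetic. It assumes exactly 1,000 paces to the mile and 12 inches to the foot, and the names (`roman_mile_feet` and the two constants) are mine, purely for illustration.

```python
INCHES_PER_FOOT = 12
PACES_PER_MILE = 1000  # milia passuum: a thousand paces


def roman_mile_feet(pace_inches: float) -> float:
    """Length in feet of 1,000 paces, given the length of one pace in inches."""
    return pace_inches * PACES_PER_MILE / INCHES_PER_FOOT


low = roman_mile_feet(58)
high = roman_mile_feet(62)
print(f"{low:,.0f} to {high:,.0f} feet")  # 4,833 to 5,167 feet
```

So the modern 5,280-foot mile sits just outside the top of that range.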

When the Roman Empire fractured in the West, their uniform measurement system fractured as well, occasionally with hilarious consequences. Later, by the 18th century, the Roman mile had evolved from one definition to many: there were Scots miles, English miles, German miles, and so on. The German mile was 24,000-some feet (at least according to Wikipedia), compared to the English mile’s comparatively paltry 5,280 feet. (Go check that Wikipedia reference, too; there are many more variants!)

But before I get too distracted by the history of the mile, let’s move on to some other length measurements.

  • Inch — this is a fun one. The word comes from the Latin uncia, which basically means “unit”. The strange thing is that an uncia was a unit of weight rather than length—it was 1/12th of a Roman pound. While the English inch is still 1/12th of its parent measure, the ounce somehow became 1/16th of a pound.
  • Furlong — rather simply, it’s a combination of furrow and long, with a furrow being the length of a ten-acre farm field. This makes it about 1/8th of a mile.
  • Yard and Rod — these two have an intertwined history. Today, a yard is 3 feet long, and a rod is 16.5 feet long. The word yard comes from Old English gierd, meaning “rod” or “stick.” Rod comes from the Old Norse rudda, meaning “club”. According to Schwartzman, the rod and the yard were used somewhat interchangeably during the Medieval period, and only later did they settle on 3 and 16.5 feet (or thereabouts)—the “short” and the “long” yard.   
  • Fathom — originating from the Old English fæðm (“faythm”), meaning “arms” or “grasp”. It was the length of a person’s outstretched arms, and is defined as 6 feet today. Perhaps, given its nautical use, a fathom was the distance you could fall off the boat while still being rescued by someone on board?

While there are lots of other words I could choose from, here are two in particular that have a surprising connection.

  • Pound — comes from the Latin pondus, meaning “a weight.” The abbreviation lb. comes from the Latin word libra, meaning “pound” or “balance.” In most markets, merchants would assess the value of precious metals offered for payment using a balance scale (still with us in the popular imagination today). Indeed, one of the signs of the Zodiac is a balance scale. Of course, you’d need to balance the payment against a set of known weights. Over time, then, the word for the weights themselves came to be the English pound, while the word for the scale itself (libra) evolved into its abbreviation.
  • Liter — comes from the Greek litra, which was a unit of weight. Yes, libra and litra have a common origin! Schwartzman notes that lytre and pound were used interchangeably in England as late as the 17th century. When France adopted a decimal system (the precursor to modern SI units), they borrowed the word litron, changing it from a unit of weight to a unit of volume.

There are many, many more words that I didn’t have the time or energy to write up! But hopefully it’s kept your interest throughout the whole series of posts. Get a copy of Schwartzman’s The Words of Mathematics if you want to learn more. 

What weighs more: a pound of gold or a pound of feathers?

Hi everyone! I had intended to write up a full etymology post this month, but time got away from me during the holidays. So for now, I offer an amusing fact taken from Jeff Suzuki’s book, Mathematics in Historical Context.

You may know the old joke “What weighs more: a pound of gold, or a pound of feathers?” The answer, of course, is that a pound is the same regardless of what’s being weighed. However, this was not the case in the Medieval world! While the Romans imposed some uniformity of measurement on most of Europe, by Medieval times individual communities had developed their own variations. This brings us to Suzuki:

The complexity of the system of weights and measures is most obvious in what seems to be a nonsensical question: which weighs more, a pound of gold or a pound of feathers? Gold and other precious commodities were measured in Troy units, named after the semiannual trade fairs at Troyes in Champagne, France, where goods from throughout Europe could be exchanged. The Troy pound is divided into twelve troy ounces, and each ounce into twenty pennyweights, and each pennyweight into 24 grains: thus, a Troy pound is equal to 12 x 20 x 24 = 5760 grains. An avoirdupois pound (from the French “having weight”) was defined as having a weight of 7000 grains: thus a pound of gold (5760 grains) weighed less than a pound of feathers (7000 grains). Even more confusingly, the avoirdupois pound was divided into 16 inappropriately named ounces, so an ounce of gold (20 x 24 = 480 grains) was heavier than an ounce of feathers (7000 ÷ 16 = 437.5).
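Suzuki’s arithmetic is easy to verify. The short Python sketch below recomputes both pounds and both ounces in grains, using only the conversion factors quoted above; the constant and variable names are my own.

```python
# Troy units (used for gold), as quoted above.
TROY_OUNCES_PER_POUND = 12
PENNYWEIGHTS_PER_TROY_OUNCE = 20
GRAINS_PER_PENNYWEIGHT = 24

# Avoirdupois units (used for feathers), as quoted above.
GRAINS_PER_AVOIRDUPOIS_POUND = 7000
AVOIRDUPOIS_OUNCES_PER_POUND = 16

troy_ounce = PENNYWEIGHTS_PER_TROY_OUNCE * GRAINS_PER_PENNYWEIGHT
troy_pound = TROY_OUNCES_PER_POUND * troy_ounce
avoirdupois_ounce = GRAINS_PER_AVOIRDUPOIS_POUND / AVOIRDUPOIS_OUNCES_PER_POUND

print(troy_pound, GRAINS_PER_AVOIRDUPOIS_POUND)  # 5760 7000
print(troy_ounce, avoirdupois_ounce)             # 480 437.5

# A pound of gold is lighter than a pound of feathers...
assert troy_pound < GRAINS_PER_AVOIRDUPOIS_POUND
# ...but an ounce of gold is heavier than an ounce of feathers.
assert troy_ounce > avoirdupois_ounce
```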

That’s it for now! I will return in the new year with one last post, on the origins of our words for weights and measures.

The Evolution of Plane Curves

This is the third post in the “Evolution Of…” series; the first and second posts can be found here and here.

This time around, we’ll explore some of the words that have come to be used for various plane curves. First of all, a disclaimer: often, the names of the curves existed many centuries before the development of modern algebra and the Cartesian coordinate system. As a consequence, the original names for the curves are more geometric in origin (imagine one of the ancient Greeks saying “umm, well, it looks like a flower… so let’s call it the flower curve”).

While reviewing the curves listed in Schwartzman’s book, I noticed that most of them can be classified into four major groups: the conics, the chrones, the trixes, and the oids.

  1. Conics. You’ve probably heard of them: circle, ellipse, parabola, and hyperbola. The first one has its origins in the Latin word circus, which means “ring” or “hoop.” The other three are Greek, with their original meanings reflecting the Greeks’ use of conic sections. Ellipse comes from en (meaning “in”) and leipein (meaning “to leave out”). For the other two, note that –bola comes from ballein which means “to throw” or “to cast”. So hyperbola means “to cast over” and parabola means “to cast alongside”. (If you check out this image from Wikipedia, it may start to make more sense.)
  2. Chrones. The two curves I have in mind here are brachistochrone and tautochrone. In Greek, chrone means “time”. The prefixes come from brakhus and tauto-, which mean “short” and “same”, respectively. So these curves’ names are really “short time” and “same time.” Naturally enough, the brachistochrone is the curve on which a ball will take the least amount of time to roll down, while the tautochrone is the curve on which the time to roll down is the same regardless of the ball’s starting point. Finding equations for these curves occupied the time of many scientists and mathematicians in the 17th century, including a controversy between the brothers Jakob and Johann Bernoulli. You could say they had a “chronic” case of sibling rivalry. 
  3. Trixes. I am a fan of the feminine suffix –trix because it also provides us with the modern word obstetrics (literally, “the woman who gets in the way”—i.e., a midwife). We don’t use this suffix very much anymore, though aviatrix comes to mind. The algebraic curves in this category are trisectrix (“cut into three”) and tractrix (“the one that pulls”), along with the parabola-related term directrix (“the one that directs”). Interestingly, the masculine form of tractrix gives us the English word tractor.
  4. Oids. These were the most fun for me. The suffix is Greek, originating in oeides, which means “form” (though in modern English, “like” might be more appropriate). Here are some examples: astroid, cardioid, cissoid, cochleoid, cycloid, ramphoid, strophoid. Here are their original Greek/Latin meanings: “star-like”, “heart-like”, “ivy-like”, “snail-like”, “circle-like”, “like a bird’s beak”, “(having the) form of turning”. I’ve provided images of each one below; see if you can match the name to the curve!
[Images of the seven curves, in order: Curve5, Curve6, Curve1, Curve2, Curve3, Curve4, Curve7]

Don’t worry, there are plenty more word origins coming later! However, I’ll need a break to recharge my etymology batteries. Expect an “intermission” post in the next few weeks.

The Evolution of Arithmetic

This post is the second in a series; if you haven’t read the first post, on the evolution of English counting words, I’d recommend reading that one first.

As promised, this post looks at the origins of the English words for arithmetic operations. Read on, friend!

  • Plus and Minus. These two are fairly straightforward: they’re the Latin words for “more” and “less”, respectively. The symbols, though, are less clear. It appears that the letters p and m were used (sometimes appearing with a bar above, as p̄ and m̄) during the 1400s. Wikipedia claims that these first appeared in Luca Pacioli’s Summa de Arithmetica, though I’ve been unable to find a satisfactory example. In the 1500s, the modern + and – signs began to appear; Schwartzman attributes the + to an abbreviation of the Latin “et” (taking the t only) and the – to the bar from m̄.
  • Multiply. This word comes from the Latin multiplicare, meaning “to increase.” Breaking it down a little further, we have the prefix multi– (“many”) and the suffix -plex (“fold”) so that the compound word multiplex means “many folds.” (We still use “fold” language today: when we speak of a “threefold increase,” we mean that something has been multiplied by three.) The × symbol for multiplication is attributed to William Oughtred, while Schwartzman gives credit for the dot • to Gottfried Wilhelm Leibniz.
  • Divide. This word comes from Latin as well, with the origin being dividere, meaning “to separate.” (As a side note, the root videre means “to see” and gives us the modern word video, which means “I see”.) Putting di– and videre together, I suppose this means that division is literally “to see in two.”

Notice that all four of these words originate in a description of the operation itself. It turns out that exponents and roots are a little more metaphorical in their meaning:

  • Exponent. Once again, we have a Latin origin: the prefix ex– and the verb ponere, roughly meaning “to put out.” Unlike the four arithmetic operations, though, the original meaning is typographical—the exponent is the number that is “put out” above and to the right of the base. In part, it’s because the exponent is a relatively new development; Schwartzman attributes the notation to Descartes, specifically La Géométrie (1637).
  • Root. Finally, a non-Latin word! The word rot means “cause” or “origin”, which makes sense when you consider that since 8 = 2^3, its “origin” is 2. If you trace the word further back, the Proto-Indo-European root (see what I did there?) is wrad-. Thus, the Latin-based words radical and radish come from a source similar to root.

And there you have it! In the next installment, I’ll get a little more geometric and explore some words we’ve come to use for algebraic curves.

The Evolution of Numbers

I’ve always loved word origins. Often, knowing where a word comes from can provide you with insight on what it means today. At the very least, it makes you look smart at parties (e.g., “Did you know that the word apocalypse shares the same Greek root as calypso…”).

This post is the first in a series on mathematical word origins. Since math(s) is an old subject, many mathematical terms have ancient roots (for English, this usually means Greek and Latin). For today, we’ll explore the origins of the English words for counting and arithmetic. For all posts in this series, my primary source is Steven Schwartzman’s The Words of Mathematics, with an occasional assist from the Online Etymology Dictionary.

  1. One through Ten. I’m actually going to skip these; most Indo-European languages have a base 10 system whose words are in rough correspondence with each other (for instance, the word for 6 is six, sechs, sei, seis, and seks, in French, German, Italian, Spanish, and Norwegian, respectively). If you’re interested in exactly how those consonants correspond, go read up on Grimm’s law.
  2. Eleven and Twelve. These words have a particularly Germanic origin: for example, in French you’d use onze and douze while in German it’s elf and zwölf. The English word eleven comes from the Old English endleofon, which basically translates as “one left over.” If you were counting up 11 items, you’d get to ten, and then say “and there’s one left over.” Eventually, this got condensed down into our modern eleven. (It makes a certain amount of sense, no?) The same thing goes for twelve: the Old English twelf comes from the Proto-Germanic twa-lif, which means “two left.”
  3. Thousand and Million. The word thousand comes from the Germanic thus (thick) and hund (hundred), making a thousand a “thick hundred.” In the Romance languages, though, the word thousand comes from the Latin mille. The English word million originally meant “a great thousand.” Interestingly, there appears to be a connection between the words thousand and dozen (e.g., the Dutch word for a thousand is duizend), leading some scholars to speculate that some Germanic cultures had a mixed base-10 and base-12 system. This may also be seen in the fact that in the UK, a “hundredweight” is 112 lbs.
  4. Zero. This one’s a relatively new addition to English: according to Schwartzman, the first appearance in print of the word zero was in Philippi Calandri’s De Arithmetica Opusculum (1491). The word itself was borrowed from the Arabic صفر (sifr) which means “empty.” Interestingly, the word cipher has the same origin. One other point: in most European languages, zero is treated as a plural (“I have zero apples” instead of “I have zero apple”). I’d be interested to know if this is the case in Arabic and other Semitic languages.

Up next: the operations of arithmetic…