php.lv forumi

Which college to choose?


rrr77


Originally:

https://gist.github.com/7565976a89d5da1511ce

 

Hi Donald (and Martin),

 

Thanks for pinging me; it's nice to know Typesafe is keeping tabs on this, and I

appreciate the tone. This is a Yegge-long response, but given that you and

Martin are the two people best-situated to do anything about this, I'd rather

err on the side of giving you too much to think about. I realize I'm being very

critical of something in which you've invested a great deal (both financially

and professionally) and I want to be explicit about my intentions: I think the

world could benefit from a better Scala, and I'd like to see that work out even

if it doesn't change what we're doing here.

 

Right now at Yammer we're moving our basic infrastructure stack over to Java,

and keeping Scala support around in the form of façades and legacy libraries.

It's not a hurried process and we're just starting out on it, but it's been a

long time coming. The essence of it is that the friction and complexity that

come with using Scala instead of Java aren't offset by enough productivity

benefit or reduction of maintenance burden for it to make sense as our default

language. We'll still have Scala in production, probably in perpetuity, but

going forward our main development target will be Java.

 

So.

 

Scala, as a language, has some profoundly interesting ideas in it. That's one of

the things which attracted me to it in the first place. But it's also a very

complex language. The number of concepts I had to explain to new members of our

team for even the simplest usage of a collection was surprising: implicit

parameters, builder typeclasses, "operator overloading", return type inference,

etc. etc. Then the particulars: what's a Traversable vs. a TraversableOnce?

GenTraversable? Iterable? IterableLike? Should they be choosing the most general

type for parameters, and if so what was that? What was a =:= and where could

they get one from?

 

A lot of this has been waved away as something only library authors really need

to know about, but when a library's API bubbles all of this up to the top (and

since most of these features resolve specifics at the call site, they do),

engineers need to have an accurate mental model of how these libraries work or

they shift into cargo-culting snippets of code as magic talismans of

functionality.

 

In addition to the concepts and specific implementations that Scala introduces,

there is also a cultural layer of what it means to write idiomatic Scala. The

most vocal — and thus most visible — members of the Scala community at large

seem to tend either towards the comic buffoonery of attempting to compile their

Haskell using scalac or towards vigorously and enthusiastically reinventing the

wheel as a way of exercising concepts they'd been struggling with or curious

about. As my team navigated these waters, they would occasionally ask things

like: "So this one guy says the only way to do this is with a bijective map on a

semi-algebra, whatever the hell that is, and this other guy says to use a

library which doesn't have docs and didn't exist until last week and that he

wrote. The first guy and the second guy seem to hate each other. What's the

Scala way of sending an HTTP request to a server?" We had some patchwork code

where idioms which had been heartily recommended and then hotly criticized on

Stack Overflow threads were tried out, but at some point a best practice

emerged: ignore the community entirely.

 

Not being able to rely on a strong community presence meant we had to fend for

ourselves in figuring out what "good" Scala was. In hindsight, I definitely

underestimated both the difficulty and importance of learning (and teaching)

Scala. Because it's effectively impossible to hire people with prior Scala

experience (of the hundreds of people we've interviewed perhaps three had Scala

experience, of those three we hired one), this matters much more than it might

otherwise. If we take even the strongest of JVM engineers and rush them into

writing Scala, we increase our maintenance burden with their funky code; if we

invest heavily in teaching new hires Scala they won't be writing production code

for a while, increasing our time-to-market. Contrast this with the default for

the JVM ecosystem: if new hires write Java, they're productive as soon as we can

get them a keyboard.

 

Even once our team members got up to speed on Scala, the development story was

never as easy as I'd thought it would be. Because one never writes pure Scala in

an industrial setting, we found ourselves having to superimpose four different

levels of mental model — the Scala we wrote, the Java we didn't write, the

bytecode it all compiles into, and the actual problem we were writing code to

solve. It wasn't until I wrote some pure Java that I realized how much extra

burden that had been, and I've heard similar comments from other team members.

Even with services that only used Scala libraries, the choice was never between

Java and Scala; it was between Java and Scala-and-Java.

 

Adding to the unease in development were issues with the build toolchain. We

started with SBT 0.7, which offered a pleasant interface to some rather dubious

internals, but by the time SBT 0.10 came out, we'd had endless issues trying to

debug or extend SBT. We looked at using 0.10, but we found it to have the exact

same problems managing dependencies (read: Ivy), two new, different flavors of

impenetrable, undocumented, symbol-heavy API, and an implementation which can

only be described as an idioglossia. The fact that SBT plugin authors had to

discover what "best practices" are in order to avoid making two plugins

accidentally incompatible should have been a red flag for any tool which

includes type safety as a selling point. (The fact that I tried to write a plugin

to replace SBT's usage of Ivy with Maven's Aether library should have been a red

flag for me.)

 

We ended up moving to Maven, which isn't pretty but works. We jettisoned all of

the SBT plugins I wrote to duplicate Maven functionality, our IDE integration

worked properly, and the rest of our release toolchain (CI, deployment, etc.) no

longer needed custom shims to work. But using Maven really highlighted the

second-class status assigned to it in the Scala ecosystem. In addition to the

"enterprisey" cat-calls and disbelief from the community, we found out that

pointing out scalac's incremental compilation bugs had gotten that feature

removed outright. Even the deprecation warning for -make: suggests using SBT or

an IDE. This emphasis on SBT being the one true way has meant the

marginalization of Maven and Ant -- the two main build tools in the Java

ecosystem.

 

Cross-building is also crazy-making. I don't have any good solutions for

backwards compatibility, but each major Scala release being incompatible with

the previous one biases Scala developers towards newer libraries and promotes

wheel-reinventing in the general ecosystem. Most Scala releases contain

improvements in day-to-day programming (including compilation speed), but an

application developer has to wait until all their dependencies are upgraded

before they themselves can upgrade. If they can't wait, they have to take on the

maintenance burden of that library indefinitely. In order to reduce their

maintenance overhead, they naturally look for another, roughly equivalent

library with a more responsive author. Even if the older library is better-

tested, better-documented, and better-featured it will still lose out over time

as developers jump ship for something that works with Scala 2.next sooner. (It's

also worth noting that most companies using Scala at scale or in mission-

critical capacities will not immediately upgrade; the library authors they

employ will likely be similarly conservative, and the benefit their experience

brings to their code will benefit the community less and less over time. As far

as I've found, we're the only big startup in SF using 2.9.)

 

Once in production, Scala's runtime characteristics were the least subtle

problem. At one point, half the team was working on a distributed database, and

given the write fanout for our large networks some parts of the code could be

called 10-20M times per write. Via profiling and examining the bytecode we

managed to get a 100x improvement by adopting some simple rules:

 

1. Don't ever use a for-loop. Creating a new object for the loop closure,

passing it to the iterable, etc., ends up being a forest of invokevirtual calls,

even for the simple case of iterating over an array. Writing the same code as a

while-loop or tail recursive call brings it back to simple field access and

gotos. While I'm sure Scala will have better optimizations in the future, we

had to mutilate a fair portion of our code in order to actually ship it. (In

another service, we got away with just using the ScalaCL compiler plugin and

copying things to and from arrays instead of using immutable collections.)

 

2. Don't ever use scala.collection.mutable. Replacing a

scala.collection.mutable.HashMap with a java.util.HashMap in a wrapper produced

an order-of-magnitude performance benefit for one of these loops. Again, this

led to some heinous code as any of its methods which took a Builder or

CanBuildFrom would immediately land us with a mutable.HashMap. (We ended up

using explicit external iterators and a while-loop, too.)

 

3. Don't ever use scala.collection.immutable. Replacing a

scala.collection.immutable.HashMap with a java.util.concurrent.ConcurrentHashMap

in a wrapper also produced a large performance benefit for a strictly read-only

workload. Replacing a small Set with an array for lookups was another big win,

performance-wise.

 

4. Always use private[this]. Doing so avoids turning simple field access into an

invokevirtual on generated getters and setters. Generally HotSpot would end up

inlining these, but inside our inner serialization loop this made a huge

difference.

 

5. Avoid closures. Ditching Specs2 for my little JUnit wrapper meant that the

main test class for one of our projects (~600-700 lines) no longer took three

minutes to compile or produced 6MB of .class files. It did this by not capturing

everything as closures. At some point, we stopped seeing lambdas as free and

started seeing them as syntactic sugar on top of anonymous classes and thus

acquired the same distaste for them as we did anonymous classes.
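
 

For readers who want to see what rules 1 and 4 look like in code, here is a minimal sketch. It is not taken from Yammer's codebase: the names LoopSketch, sumFor, sumWhile, sumRec and Counter are made up for illustration, and the cost differences described in the letter applied to the Scala 2.9-era compiler, so they may not hold for newer compilers or JVMs.

```scala
object LoopSketch {
  // For-comprehension over a Range: the compiler rewrites this into
  // a call to Range.foreach that receives a closure object.
  def sumFor(xs: Array[Int]): Long = {
    var total = 0L
    for (i <- 0 until xs.length) total += xs(i)
    total
  }

  // Hand-written while-loop: index arithmetic and array access only.
  def sumWhile(xs: Array[Int]): Long = {
    var total = 0L
    var i = 0
    while (i < xs.length) {
      total += xs(i)
      i += 1
    }
    total
  }

  // Tail-recursive variant; @tailrec makes compilation fail if the
  // call cannot be turned into a loop.
  def sumRec(xs: Array[Int]): Long = {
    @annotation.tailrec
    def go(i: Int, acc: Long): Long =
      if (i == xs.length) acc else go(i + 1, acc + xs(i))
    go(0, 0L)
  }
}

final class Counter {
  // private[this] marks the field as object-private, the form the
  // letter recommends for fields touched in hot loops.
  private[this] var count = 0
  def increment(): Int = { count += 1; count }
}
```

All three sum methods return the same result; the difference the letter describes is only in the bytecode each one produces, which is worth measuring on your own compiler version before rewriting anything.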

 

Now, every language has its performance issues, and the best a standard library

can hope to do is to hit 80% of use cases. But what we found were pervasive

issues — we could replace all of our own usages of s.c.i.HashMap, but it's a

class which is extensively used throughout the standard library. It being slower

than j.u.HashMap means groupBy is slower, as is a lot of other collections

functionality I like.

 

At some point, I wondered if the positive aspects of our development experience

owed less to Scala and more to the set of libraries we use, so I spent a few

days and roughly ported a medium-sized service to pure Java. I broached this

issue with the team, demo'd the two codebases, and was actually surprised by the

rather immediate consensus on switching. There are definitely aspects of Scala

we'll miss, but it's not enough to keep us around.

 

Already I've moved our base web service stack to Java, with Scala support as a

separate module. New services are already being written on it, and given the

results from our Hack Day at the beginning of this week it hasn't slowed our

ability to quickly ship complex code. I'm keeping a close eye on the effects of

this change, but I'm optimistic, and the team seems excited. We'll see.

 

So.

 

I've tried hard here not to offer you advice. Some of these problems could

easily be specific to our team and our workload; some of them won't make a

difference in how your company does; some of them aren't even your problems to

solve, really. But they're still the problems we've encountered over the past

two years, and they compose the bulk of what's motivating this change.

 

Despite the fact that we're moving away from Scala, I still think it's one of

the most interesting, innovative, and exciting languages I've used, and I hope

this giant wall of opinion helps you in some way to see it succeed. If there's

anything here I can clarify for you, please let me know.

In my opinion, this says everything there is to say about Scala. Plus my personal opinion: Scala looks like some academic's regurgitated breakfast. :)



A typical comment from someone who has never worked with Scala, has only seen it from a distance, and is by nature so self-assured that he believes only he is ever right.

In principle you could say that about any language you haven't learned.

 

I won't rebut every argument in the quote, because many of the listed drawbacks are actually advantages if you know how to use them, but the two main things it complains about are:

1) Language complexity.

The language seems complex only to those who haven't learned it, because it contains new concepts that are not found in other languages.

Everything you can do in Java can be done very simply in Scala. Meanwhile, the new features that seemingly make the language complex are usually either impossible in Java or require 10-20x more code.

 

2) Performance.

Yes, of course, Scala's collection classes are somewhat slower than Java's collection classes, but:

1) The quote you posted is two years old; Scala has developed considerably in that time, and the arguments it makes are outdated.

2) Even if Scala's collections are somewhat slower than Java's, they are many times faster than PHP, Python, JS, Ruby, and others.

3) Nothing at all in Scala prevents you from using Java's collections, in which case the speed will be exactly the same.
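
 

A minimal sketch of that last point, assuming a simple word-count task. WordCount and count are made-up names for illustration, not code from the letter or from any library; the only real API used is java.util.HashMap.

```scala
import java.util.{HashMap => JHashMap}

object WordCount {
  // Counts occurrences with java.util.HashMap directly, without going
  // through a Scala collection wrapper, iterating with a while-loop.
  def count(words: Array[String]): JHashMap[String, Integer] = {
    val counts = new JHashMap[String, Integer]()
    var i = 0
    while (i < words.length) {
      val w = words(i)
      val prev = counts.get(w) // null when the key is absent
      val next = if (prev == null) 1 else prev.intValue + 1
      counts.put(w, Integer.valueOf(next))
      i += 1
    }
    counts
  }
}
```

The trade-off is the one the letter complains about: the code works, but it reads like Java written in Scala syntax, and any Scala method expecting a Builder or CanBuildFrom will not accept this map without a conversion.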

 

 

If you search the web, you'll find a great many examples of Java coders who moved to Scala, never looked back, and remember their Java experience as a painful memory.

A significant part of the backend at Twitter, LinkedIn, Foursquare, and Seesmic is written in Scala, and Scala is also used by many other large companies, such as Sony and Siemens: http://www.scala-lang.org/old/node/1658

 

The fact that one developer at Yammer has trouble learning a new language and using it correctly doesn't characterize the whole ecosystem.

Edited by codez