Iliyan Bobev: 2012

Sunday, December 16, 2012

Chomsky vs Norvig

I respect Peter Norvig, and there is no denying that he has made many contributions to science, but in this argument I tend to side with Chomsky, and here's why:

Article sources:

Chomsky: http://www.theatlantic.com/technology/archive/2012/11/noam-chomsky-on-where-artificial-intelligence-went-wrong/261637/?single_page=true

Norvig: http://norvig.com/chomsky.html

Short Q-A-Comment excerpt:

Norvig: I take Chomsky's points to be the following:

Chomsky: Statistical language models have had engineering success, but that is irrelevant to science.
Norvig: I agree that engineering success is not the goal or the measure of science. But I observe that science and engineering develop together, and that engineering success shows that something is working right, and so is evidence (but not proof) of a scientifically successful model.
Bobev: The engineering success in the current case can only be evidence of "something working right" with the statistical model - and it has long been established that statistics is a scientifically successful model.

Chomsky: Accurately modeling linguistic facts is just butterfly collecting (I'd use "cataloging butterflies in an attempt to determine how they fly"); what matters in science (and specifically linguistics) is the underlying principles.
Norvig: Science is a combination of gathering facts and making theories; neither can progress on its own. I think Chomsky is wrong to push the needle so far towards theory over facts; in the history of science, the laborious accumulation of facts is the dominant mode, not a novelty. The science of understanding language is no different than other sciences in this respect.

Bobev: I agree that science is about gathering facts, but in the current case the facts being gathered find application in engineering, not in science. What scientific value can be derived from the fact that the probability of "am" following "I" is, say, 50%? Any other facts relate to automation and to resolving the difficulties of obtaining, sorting and storing these probabilities. At the same time, very little is being done on gathering, sorting and analyzing facts about the languages themselves.

Chomsky: Statistical models are incomprehensible; they provide no insight.
Norvig: I agree that it can be difficult to make sense of a model containing billions of parameters. Certainly a human can't understand such a model by inspecting the values of each parameter individually. But one can gain insight by examining the properties of the model—where it succeeds and fails, how well it learns as a function of data, etc.

Bobev: As Chomsky says, it's not that a statistical model cannot provide any insight, but that it cannot provide insight into the question we are interested in: how does the brain use language on a physiological level?

Chomsky: Statistical models may provide an accurate simulation of some phenomena, but the simulation is done completely the wrong way; people don't decide what the third word of a sentence should be by consulting a probability table keyed on the previous two words, rather they map from an internal semantic form to a syntactic tree-structure, which is then linearized into words. This is done without any probability or statistics.
Norvig: I agree that a Markov model of word probabilities cannot model all of language. It is equally true that a concise tree-structure model without probabilities cannot model all of language. What is needed is a probabilistic model that covers words, trees, semantics, context, discourse, etc. Chomsky dismisses all probabilistic models because of shortcomings of particular 50-year old models. I understand how Chomsky arrives at the conclusion that probabilistic models are unnecessary, from his study of the generation of language. But the vast majority of people who study interpretation tasks, such as speech recognition, quickly see that interpretation is an inherently probabilistic problem: given a stream of noisy input to my ears, what did the speaker most likely mean? Einstein said to make everything as simple as possible, but no simpler. Many phenomena in science are stochastic, and the simplest model of them is a probabilistic model; I believe language is such a phenomenon and therefore that probabilistic models are our best tool for representing facts about language, for algorithmically processing language, and for understanding how humans process language.

Bobev: I totally agree that the two approaches should be combined at some point. What I'm unhappy about is that the current focus of the entire field is on probabilistic models. Every new paper seems to be based solely on statistics.

Chomsky: Statistical models have been proven incapable of learning language; therefore language must be innate, so why are these statistical modelers wasting their time on the wrong enterprise?
Norvig: In 1967, Gold's Theorem showed some theoretical limitations of logical deduction on formal mathematical languages. But this result has nothing to do with the task faced by learners of natural language. In any event, by 1969 we knew that probabilistic inference (over probabilistic context-free grammars) is not subject to those limitations (Horning showed that learning of PCFGs is possible). I agree with Chomsky that it is undeniable that humans have some innate capability to learn natural language, but we don't know enough about that capability to rule out probabilistic language representations, nor statistical learning. I think it is much more likely that human language learning involves something like probabilistic and statistical inference, but we just don't know yet.

Bobev: I agree that we cannot rule out a probabilistic element in some aspects of language use, but I think it's pretty obvious that language cannot be based only on a probabilistic representation. I cannot cite who proved what and when, but I know that if I can "invent" new words and other people still understand me, I have stepped outside any statistical representation of the language while still adhering to the language model used by the mind. What you work with is a statinguage - how can a statistical model handle that?

In-line comments on some points made by Norvig

Norvig: If you have a vocabulary of 100,000 words and a second-order Markov model in which the probability of a word depends on the previous two words, then you need a quadrillion (10^15) probability values to specify the model. The only feasible way to learn these 10^15 values is to gather statistics from data.
Bobev: At 4 bytes per float value, that makes 4×10^15 bytes, i.e. about 4 petabytes (roughly 3.5 PiB in binary units), and the most generous estimates of the human brain's capacity for non-chemical storage are about 2.5 petabytes. But even if we allow for chemical storage, or assume that the brain stores data at 1 byte per value, the language we use has some 300,000 words, which requires far more than 10^15 values (300,000^3 is about 2.7×10^16). And what if we go to a third-order Markov model? What about learning a second language? What about all other knowledge and memories? For me it's obvious that we do not store such information in our brains. So if our brains use language both more efficiently and more correctly, there must be a different representation of language in the mind.
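The storage arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes a dense table with one stored value per (two previous words, next word) combination, as the quoted argument does, and the function name `table_bytes` is made up for illustration.

```python
# Back-of-the-envelope storage for an n-th order Markov model over words.
# Assumption: a dense table, one stored value per (context, next word) entry.

def table_bytes(vocab_size: int, order: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for a dense table of vocab_size ** (order + 1) values."""
    return vocab_size ** (order + 1) * bytes_per_value

PB = 10 ** 15   # decimal petabyte
PIB = 2 ** 50   # binary pebibyte

# Norvig's case: 100,000 words, second-order model, 4-byte floats.
second_order = table_bytes(100_000, order=2)
print(second_order / PB)               # 4.0 (decimal petabytes)
print(round(second_order / PIB, 2))    # 3.55 (binary pebibytes)

# 300,000 words at 1 byte per value.
print(table_bytes(300_000, order=2, bytes_per_value=1) / PB)   # 27.0

# Third-order model, 100,000 words, 4-byte floats.
print(table_bytes(100_000, order=3) / PB)                      # 400000.0
```

The 3.5 figure corresponds to binary units; in decimal units the same table is 4 petabytes. Either way it is above the 2.5 petabyte estimate mentioned in the comment.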

Norvig: Clearly, it is inaccurate to say that statistical models (and probabilistic models) have achieved limited success; rather they have achieved a dominant (although not exclusive) position.
Bobev: Statistical models (and probabilistic models) have achieved no success in explaining how language is represented or used in our mind or on a physiological level. Statistical models are dominant at specific language-based contests and tasks because they are cheats - they strive to replicate only the outcome, not the process. And all the "progress" is due to computation power - after all, Bayesian networks have not changed since the 1980s. I do consider hybrid models useful, if they provide some understanding of how to combine statistical and rule-based processing.

Norvig: Another measure of success is the degree to which an idea captures a community of researchers. As Steve Abney wrote in 1996, "In the space of the last ten years, statistical methods have gone from being virtually unknown in computational linguistics to being a fundamental given. ... anyone who cannot at least use the terminology persuasively risks being mistaken for kitchen help at the ACL [Association for Computational Linguistics] banquet."
Bobev: That is exactly why I'm so upset. Many researchers have been pushed towards statistical methods by conformity. The rapid success of the engineering side of the field created a lot of hype, which has shifted a lot of interest, support and funding away from the goal of understanding how language works and towards simulating results.

Norvig: A dictionary definition of science is "the systematic study of the structure and behavior of the physical and natural world through observation and experiment," which stresses accurate modeling over insight,...
Bobev: Wait, what? 1) I don't find any mention of "modeling" in that definition, let alone a stressed one. 2) If you want to read "study of the structure and behavior" as "modeling", it should be in the sense of "discover the underlying model" rather than "create a model that simulates". It's important to clarify what exactly you will study, because it's one thing to study the human use of language, and another to study the machine use of language.

Norvig: It certainly seems that this article is much more focused on "accurately modeling the world" than on "providing insight."
Bobev: Again, where do you see modeling? Science aims at understanding what governs the observed phenomenon - if the paper addresses the efficiency of some electrodes, it does so in order to explain why. Scientists do experiments to confirm a specific idea - an insight, if you will. The discipline that strives to create a system that yields specific results is engineering - the outcomes of that process are called "prototypes", not "experiments".

Norvig: and for the 2010 Nobel Prizes in science:
Physics: for groundbreaking experiments regarding the two-dimensional material graphene
Chemistry: for palladium-catalyzed cross couplings in organic synthesis
Physiology or Medicine: for the development of in vitro fertilization
My conclusion is that 100% of these articles and awards are more about "accurately modeling the world" than they are about "providing insight," although they all have some theoretical insight component as well. I recognize that judging one way or the other is a difficult ill-defined task, and that you shouldn't accept my judgements, because I have an inherent bias.
Bobev: Well, you got it wrong. Unless they have the theoretical insight "component" (probably more of a core, really), no one would consider them science. There is a rule of thumb in science that experiments should be chosen so that they can clearly prove or disprove a hypothesis you have already formed. There are of course accidental discoveries, and sometimes running a random experiment with no clear goal or idea might set you on the right track, but even with this approach you have to fit the results into a theory, that is, into insight.

Norvig: I repeated the experiment, using a much cruder model with Laplacian smoothing and no categories, trained over the Google Book corpus from 1800 to 1954, and found that (a) is about 10,000 times more probable. If we had a probabilistic model over trees as well as word sequences, we could perhaps do an even better job of computing degree of grammaticality.
Furthermore, the statistical models are capable of delivering the judgment that both sentences are extremely improbable, when compared to, say, "Effective green products sell well." Chomsky's theory, being categorical, cannot make this distinction; all it can distinguish is grammatical/ungrammatical.
Bobev: OK, let's place the probability results for these three sentences on a scale - plot them on the interval (0,1). You haven't provided the exact values, but I have some idea of what numbers these statistics yield. The first two sentences are so close to 0 that you probably felt it wouldn't favor your argument to show all the zeroes between the decimal point and the meaningful digits. In that respect the first two sentences might differ by a factor of 10,000 and still be really close to each other, both being in the region of 10^-15. Now, these two sentences might have got the right probability order just as a fluke - I seriously doubt that the same result would be obtained with the same sentences but with the color replaced. Go through all the colors and let me know if the results aren't wrong at least 50% of the time.
And why would we care that there is some other sentence that is vastly more probable? What does it have to do with determining whether a given sentence is grammatical? Because that's what we need. If we as humans can determine with certainty whether a sentence is grammatical or not, the desired model should be able to do the same. Only then can we argue that such a model may be physiologically implemented in the human brain.
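To make the "large ratio, tiny absolute values" point concrete, here is a minimal sketch of a bigram model with add-one (Laplace) smoothing. It is not Norvig's experiment: the four-sentence corpus and the three test sentences are invented for illustration, and a bigram model is much cruder than anything trained on the Google Books corpus.

```python
from collections import Counter

# Toy corpus, invented for illustration only.
corpus = [
    "green products sell well",
    "green ideas sell well",
    "new ideas sleep",
    "people sleep well",
]

# Add sentence boundary markers and collect counts.
tokens = [["<s>"] + s.split() + ["</s>"] for s in corpus]
vocab = {w for sent in tokens for w in sent}
context_counts = Counter(w for sent in tokens for w in sent[:-1])
bigram_counts = Counter((a, b) for sent in tokens for a, b in zip(sent, sent[1:]))

def bigram_prob(prev: str, word: str) -> float:
    """Add-one (Laplace) estimate of P(word | prev)."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))

def sentence_prob(sentence: str) -> float:
    """Probability of a whole sentence as a product of bigram probabilities."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(words, words[1:]):
        p *= bigram_prob(prev, word)
    return p

seen = sentence_prob("green products sell well")     # appears in the corpus
plausible = sentence_prob("green ideas sleep well")   # unseen, familiar word order
scrambled = sentence_prob("well sleep ideas green")   # unseen, scrambled order

print(f"seen:      {seen:.2e}")
print(f"plausible: {plausible:.2e}")
print(f"scrambled: {scrambled:.2e}")
print(f"plausible / scrambled ratio: {plausible / scrambled:.0f}")
```

Even in this toy setting the model ranks the three sentences in the expected order while giving all of them small probabilities; with a realistic vocabulary the absolute values shrink much further. Swapping individual words in the test sentences (the colors, for example) is an easy way to see how stable the ordering is.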

Norvig: "All grammars leak."
Bobev: Agreed. We need to incorporate statistics in modeling human use of language, but it's more than just probabilities.

Norvig: Since people have to continually understand the uncertain, ambiguous, noisy speech of others, it seems they must be using something like probabilistic reasoning. Chomsky for some reason wants to avoid this, and therefore he must declare the actual facts of language use out of bounds and declare that true linguistics only exists in the mathematical realm.
Bobev: I don't think Chomsky wants to avoid the use of probabilistic methodology, but rather that he's "concerned with discovering a mental reality underlying actual behavior" in humans. He believes that such is the goal of linguistics, and you believe it's the creation of "statistical (or probabilistic) models, which while accurately modeling reality, do not make claims to correspond to the generative process used by nature".
Conclusion: So the whole thing is comparing apples and oranges -- you simply strive for different things.

I personally do not care what the goal of linguistics as a science is, but I tend to side with Chomsky in the view that efforts to build models which "make no claim to correspond to the generative process used by nature" are of no use (or are even detrimental) for discovering what underlies actual behavior in language use.

IMHO the only way to advance both aspects (accurate modeling and discovering the process used by nature) is to go hybrid.

Tuesday, December 11, 2012

I think the movie "In the Mood for Love" is great

Recently I watched a Vsauce video: http://youtu.be/i8tHWPRPF9M
which led me to http://www.youtube.com/watch?v=JnQmJaqUJ0Y&feature=share&list=SP77A0EB0EAF0FEFEB

As the authors of the show suggest, it's best to watch the movie first and then view the discussion of it - that's how I did it. The same goes for the comments below - I'll try to avoid spoilers, but it's best to form your own impressions first.

So I watched the movie In the Mood for Love by Kar Wai Wong and I really liked it, and here is why:
I love the visuals and style - they are vivid, consistent and inspiring. The scene of the corridor with red drapes is unforgettable. The pacing felt a bit slow for me, but it was worth it, and it probably helps the emotional build-up. The technique of having the protagonists rehearse what they want to do, but never did, was very effective for me. Music was also well used to convey the mood.
In terms of story, I think that society has changed a lot since the 60s, and today people in the same situation would definitely choose their own happiness over adhering to society's expectations. Nonetheless, the emotions portrayed are as vibrant and tangible today as they were years ago. That makes the story transcend the choice of time and place, and that's what makes it great.
It's not a movie that will stun you, but it's a movie that will make you feel, and think, and that is something. It stands out, compared to the tons of meaningless films pouring out of Hollywood these days.
I will definitely explore more of the works of Kar Wai Wong.

Monday, November 26, 2012

I think that the movie Dark Knight Rises is weak

A few days ago I strongly expressed my opinion of The Dark Knight Rises, which is that the movie is weak. Here are some of the points I made in the argument that followed.
Although the movie currently has a score of 8.8 on IMDb, it failed to meet my expectations. I don't think that meeting the expectations of half a million people makes it good; it only means that many people have low expectations.
I do admit that my opinion is subjective, because I really liked the previous film, and that set the bar high for this installment. If I hadn't previously watched The Dark Knight, I would probably have given the current film a higher score. I might have presented my opinion quite forcefully, and I ask to be excused on account of my strong emotional attachment, which comes partly from subjective factors. However, although subjective factors might have affected my reaction, there are a lot of objective aspects of the film that cannot be regarded as anything other than weak.
The film was compared with other movies based on comics, but I do not think that is correct, as a common source of ideas does not imply the same genre. I find that Iron Man and The Avengers gravitate towards comedy, while The Dark Knight trilogy is more of a drama. Therefore I expect different depth and effort from these different movies.
I also need to mention that there are still many elements in The Dark Knight Rises that I like, but they cannot outweigh the failures, and the overall evaluation remains negative. Similarly, I have a very positive opinion of The Dark Knight, but it's far from perfect. I find many issues in it, but they are not in the core elements and can be ignored.
SPOILERS SPOILERS SPOILERS SPOILERS SPOILERS
to avoid spoilers stop reading here
Let's dive in with some examples and comparisons:
One of the things that tilted the scales in the negative direction was the interruption of the story's tempo. Everything was flowing in an orderly way until they placed our hero in the well prison; then time jumped ahead by months, and suddenly our hero went from beaten and sick to healthy and strong. Besides breaking the story line, that is also a cheat. The hero is a hero because he overcomes his weaknesses through suffering and pain, thanks to his will. This is totally hidden in this case, as if the passing of time were the only prerequisite for the hero to be reborn with the qualities needed to pass previously impassable obstacles. Compare that with The Count of Monte Cristo, which is more realistic and believable. If a man is imprisoned, he gets crushed. A man deprived of freedom and of a future has to find new hope in order to prevail. In the Count, the protagonist suffers, and from the suffering he reforms. In The Shawshank Redemption the main character suffers the usual unpleasantness before he realizes there is something deep inside you that people can't touch or get to... 'HOPE'.
In Batman, the whole thing is presented so briefly that it fails to convey the suffering. The idea is clearly there, but the execution destroys it. This should be the essence of the film, and what I got is that in one scene he gets up off the floor, in the second he does 5 push-ups, and in the third he's climbing the wall. I get the feeling that the author wants to show how the protagonist finds new hope and new purpose, but I don't see that he lost them in the first place. They take care to explain that the prison is designed to crush hope by first giving it in the form of the patch of sky, and then taking it away through the repeated failure of the escape attempts. Well, it didn't work, and I didn't see the character feeling imprisoned, i.e. crushed. The result is that there is no clear trigger for the internal change. Was it the life in the well? The few words of a cellmate in a foreign language? Was it the images on the TV screen in the cell? Or some graffiti on the prison wall they never showed? Some change had to happen, because Alfred was telling him the same thing from the start, "you need purpose in life", yet back then it had no effect. The entire part felt as if getting out was the most natural thing, as if they had let him walk out through the front door. I think this portion of the story had to be presented in much more detail, as it's the resolution of the hero's dilemma, but instead it's fragmented and pretty much ignored.
I find most of the characters in the movie bleak and incomplete. In fact, the only two characters that I think are built well are Alfred and Blake. The worst character is Peter Foley (the Deputy Police Commissioner). He is presented as weak and incompetent, with no explanation of why such a man holds a position of significance that demands the opposite qualities. How did he end up in the police force instead of in fast food? But let's assume that the author didn't bother with secondary characters, and compare the main villains in the last two films - the Joker and Miranda. From the Joker's actions the viewer can determine that his motives are a desire for chaos and anarchy; he wants to prove to the world that even the most exemplary and honest man can be corrupted, that everyone who stands for the 'good' carries the capacity for 'bad'. To the world, the Joker himself presented his goals as 'exposing the corruption', which adds another layer to the character - how he wants to present himself, and potentially how he manages to win followers to his side. It is made clear that his view of the world is the result of a traumatic childhood, which makes the character very believable.
I can't find any strong motivation that would explain Miranda's actions, so I have to speculate. Does she want to avenge her father, or does she want to complete what he started? What in her character explains her strong determination? Is it love or grief that makes her ready to kill millions and sacrifice herself in the process? Usually, when someone is willing to sacrifice their life, it's for an idea or a cause that will live on past them. It's clear for the Joker - if Batman kills him, that will be proof that Batman is both the judge and the executioner, and that is not justice. Miranda never tells the world that this is her vengeance, so what would she prove if her plan were carried out? And why is she on this path in the first place? Her father got killed because he was planning to kill many people (and I don't remember the first movie that well, but it might have been an accidental death). I reckon normal people don't rush to carry out a legacy project of mass murder, even if that project belonged to their father. So for her to follow that goal, her father would first have had to win her trust as a person, and then convince her of his values and win her over to the cause. I find it much more realistic when kids rebel against their parents instead of blindly following them.
The main ideas of the two movies also differ, and I'll try to compare them next.
In The Dark Knight the main idea is one that is less commonly explored and much more interesting. The dilemma is whether to accept the guilt of another if that will keep an idea alive. If one is ready to sacrifice his life for an idea, he should be ready to sacrifice his honor for it. But there is a twist: death might be preferable to a life filled with the hate and resentment of the ones you have sacrificed for.
What's the main idea of The Dark Knight Rises, what's the protagonist's dilemma? As I mentioned previously, I think that the main character struggles with finding new hope and purpose in life. I believe that something happened while the character was in the well, something that gave him what he couldn't find for 7 years. And since I can't clearly state what that something is, the author didn't do a good job of carrying out the main idea of the story.

I may continue this if the argument I'm in carries on.

Sunday, October 28, 2012

Why We Can't Solve Big Problems

Recently I read the article Why We Can't Solve Big Problems: http://www.technologyreview.com/featuredstory/429690/why-we-cant-solve-big-problems/

Here are my thoughts on that:
I think the LHC is a bigger engineering achievement than the Apollo Program. I think there are significant new breakthroughs in many scientific areas. The difference is that it takes longer for them to reach everyday applications, and that's mostly because we have learned a lesson from global warming. Take for example nanomaterials - we want to make sure they won't cause cancer before we use them to build cars or houses. Or GMOs - people are scared that they will reduce biodiversity. So we are not unable to solve problems, we're just cautious, and that's not necessarily a bad thing.
Information availability is also a factor - it's much harder to impress someone today than 50 years ago. At that time a 300m tower would be an attraction, and today you'd say "Meh" to anything under 1km.
While many of the glorified past achievements were just a matter of engineering at a bigger scale, today's issues require changes on a sociological level. The complaint about investment misses the point. Investors will invest in what makes money - that's what they do. It's the consumers who indirectly control where the investment goes, and if people are satisfied with shiny gadgets, that's their problem. Global warming would be solved tomorrow if you convinced everyone that taking public transport or a bicycle indicates higher social status than driving a 4x4 with a 3-liter engine.
One last thing needs to be noted - in the 1960s people consumed half or less of what a modern-day person does (electricity, water, gas, plastic, food, medication), and the world population then was 3 billion. You can do the calculation to see the overhead in what is needed to sustain the population today. If anything, it is that overhead that creates problems and takes a toll on humanity's ability to dream bigger dreams.
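A rough version of that calculation, as a sketch under stated assumptions: the 3 billion figure and the "half or less" per-person consumption come from the text above, while the 2012 world population of about 7 billion is an added assumption, and the factor of 2 is the lower bound implied by "half or less".

```python
# Rough estimate of how much more total consumption has to be sustained today.
POP_1960S = 3.0e9          # from the text
POP_2012 = 7.0e9           # assumption: approximate world population in 2012
PER_CAPITA_FACTOR = 2.0    # "half or less" then means at least 2x per person now

def total_demand_ratio(pop_then: float, pop_now: float, per_capita_factor: float) -> float:
    """Ratio of total consumption now to total consumption then."""
    return (pop_now / pop_then) * per_capita_factor

ratio = total_demand_ratio(POP_1960S, POP_2012, PER_CAPITA_FACTOR)
print(f"{ratio:.1f}x")  # 4.7x
```

So under these assumptions the world has to supply at least about 4.7 times what it did in the 1960s, and more if per-person consumption has grown by more than a factor of two.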

Saturday, February 4, 2012

Today I had an argument about whether smoking should be prohibited in public spaces. I am totally against smoking and I stated that I think it should be outlawed altogether. The opposing argument was that this way I'm restricting individual freedom and everyone's ability to choose for themselves. And I realized my opinion on this topic is quite strong. I'm pro freedom, but I still think that cutting some freedom might be good. I do have my reasons, and I'll try to clarify them now. My reasons are actually quite selfish, but I think this angle deserves some fresh discussion.

Do you remember Demolition Man? Spoiler alert! It describes a future society where there's more or less no crime. There are a lot of rules and restrictions, but everyone you see is happy and OK with that. Towards the end we see some underground people who are not, but they were not part of that fictional society either. The main idea is that forcing a healthy way of life at the expense of freedom is bad.
But is it really, and for whom? The problem with the fictional society was that in case of a crisis, it is unprepared for a more chaotic future. I think that the restrictions of the society there were exaggerated and unrealistic.
I don't want to cut freedom and leave no alternative, but I think that unhealthy living should be made much tougher, so that no one would keep at it. One might say that the person makes a choice and takes the risk individually, but there are actually effects extending beyond the person. So you choose to smoke and have a life expectancy of 50 years. But your demise will affect society too. If you get cancer, you will no longer be able to work and add value to society, but will only consume. The cost of the treatment will fall on everyone's social taxes. There are other, weaker negative effects: the percentage of young people who smoke will be proportional to the percentage of smokers in society. You cannot expect only 10% of 18-year-olds to smoke if 50% of the 28-year-olds do. A person's poor health reflects on everyone around them. Poor health and life expectancy statistics make the entire society undesirable.

I believe that people have obligations to society. People are an investment in the future, and that investment is no longer made only by the immediate family. If you are born today, you carry society's hopes for the future. Everyone pays taxes, which in one way or another support the new members of society. Education, protection, health. We all want our kids to reach the stars and that is why we invest in them. Unfortunately, despite all the investment and the education, people do stupid things. The majority of mankind does not know what's best for them. I don't claim that I know what's best for anyone (even myself), but I do know what is bad. There are two things I want everyone to do:
- try to improve their understanding - of the world and themselves
- try to live healthier
I'm certain that if everyone strives towards these two simple goals, society will prosper, regardless of policy, social structure or even economy. Since we cannot expect everyone to understand the benefits and accept the cost of these, we should enforce them. We don't have a good enough way to measure adherence to the first goal, so we should enforce it indirectly - punish illiterate and spoiled people. Enforcing a healthy way of life is possible to some extent, and I don't think it will be bad at all, even if that means limiting some freedoms.