Tech conferences (must-attend) to keep up with your technical chops

 

You can find videos on YouTube or similar websites.


6.004 Spring 2013: MIT Lecture 15 Notes: The Memory Hierarchy


Thanks to Chris Terman and MIT OpenCourseWare. These notes are from an MIT lecture found here

Timeline:

0:20 What we want in a memory.
2:25 Technologies for memories. (see table)
5:30 SRAM Memory Cell
10:23 1-T Dynamic RAM
16:00 Hard disk
17:40 Challenge to cope with Quality vs. Quantity
18:15 Key idea: Best of both worlds using Memory hierarchy
20:55 Memory reference patterns. Locality for program, stack and data
24:20 Exploiting the Memory Hierarchy
31:53 The Cache Idea: Program-Transparent Memory Hierarchy
34:16 How high of a Hit Ratio do we need?
36:15 The Cache Principle
46:16 Direct Mapped Cache
47:36 Contention Problem: Contention, Death and Taxes

 

The professor covers the low-level details of memory: addresses, DIN/DOUT, and so on.

Two kinds of memories:

  1. 2-port main memory: one port is for the program counter, which returns an instruction; the other port serves load and store instructions, computing a memory address with an offset to get data.
  2. Register file: built into the CPU datapath, supplying two register operands for each instruction. Same organization as the 2-port memory.

Technologies for memories:

              Capacity           Latency  Cost
Register      100's of bits      20ps     $$$$
SRAM          100's of Kbytes    1ns      $$$
DRAM          1000's of Mbytes   40ns     $
Hard disk*    100's of Gbytes    10ms     cents
Desired       1's of Gbytes      1ns      cheap

The real bottleneck: if we have to fetch each instruction from main memory, there is a large latency on every fetch, even though the processor itself is very fast.

In the past the speed of the processor has improved with CMOS technology. The capacity of DRAM has increased as transistors get smaller and smaller, but DRAM latency, which is dictated by the size of the memory, has not improved dramatically compared to the processor.

SRAM

Static RAM: the technology used in our register file (one of the types of memory mentioned above). The professor walks through the low-level gates and transistors of SRAM, which is built from inverters: a static, bi-stable storage element. Writes of bits "overpower" the reads.

We can build multi-port SRAMs: one can increase the number of SRAM ports by adding access transistors. By carefully sizing the inverter pair, so that one is strong and the other is weak, we can ensure that the WRITE bus only has to fight the weaker one while READs are driven by the stronger one, thus minimizing both access and write times.

1-T Dynamic Ram

DRAM is the high-capacity memory system, and it is much simpler. Six transistors per cell (as in SRAM) may not sound like much, but they add up quickly. So what is the fewest number of transistors that can be used to store a bit? One, as the name suggests: a single access transistor plus a capacitor, whose capacitance is determined by area, a better dielectric and a thinner film (there is a formula to calculate it).

The interesting idea here is that every ~10ms your computer reads all the data in the DRAM and writes it back again (refresh) so that it does not get lost.

A trick to increase throughput is pipelining: send the address over in a couple of different chunks.

Synchronous DRAM (SDRAM)

Double-clocked Synchronous Memory (DDR)

The idea of DDR RAM is that it uses a clocked transmission protocol, transferring data on both edges of the clock. The reason the machine is slow is that fetching data from this memory system is slow.

Hard disk

  • Average latency = 4ms
  • Average seek time = 9ms
  • Transfer rate = 20 Mbytes/sec
  • Capacity = 1TB
  • Cost <= $1/Gbyte
  • Spindle speed: 7,000–15,000 RPM

There are multiple platters, and the tracks at the same radius across the platters form cylinders. Tracks are divided into sectors. The shaft and the read/write head are mechanical devices. Information is stored in concentric circles to minimize random movement of the head.

Quan­tity vs Quality

  • Your mem­ory can be BIG and slow …. or …
  • SMALL and FAST.

Is there an architectural solution to this DILEMMA?

We can nearly get our wish.

KEY: Use a hierarchy of memory technologies

Key Idea

  • Keep the most often-used data in a small, fast SRAM (often local to the CPU chip)
  • Refer to main memory only rarely, for the remaining data.
  • The reason this strategy works: LOCALITY

Statistically, researchers have found a common memory reference pattern. See the diagram (21:03).

Program: the branching factor also affects the access pattern, usually if-else statements that branch program paths out.

Stack: at any given moment we are using only a small amount of the stack in a program, the activation records for the current subroutine.

Data: copying data from one data structure to another, or performing computation on it.

 

Exploit­ing the Mem­ory Hierarchy

Approach 1 (Cray, others): Expose the Hierarchy

 

Expose memory hierarchy

Hardware types say it's a SMOP ("a small matter of programming"): as the hardware guys get lazy, they push the programmer to write smarter programs. Until recently these were the fastest machines on earth, the Cray supercomputers. Seymour Cray's argument for this approach was that you cannot fake something that you do not have, namely a huge, faster memory.

  • Registers, main memory and disk are each exposed as storage alternatives
  • Tell programmers: "Use them cleverly"

Approach 2: Hide the Hierarchy

Here the idea is that the hardware looks over the program's shoulder and manages locality of reference. It is a layer of abstraction that does the memory management for you.

Hide memory hierarchy
  • Programming model: a SINGLE kind of memory, a single address space
  • The machine AUTOMATICALLY assigns locations to fast or slow memory, depending on usage patterns.

The CPU looks first at a small static cache (usually L1/L2), then at DRAM, and then at the hard disk. Most of what you buy in a processor is cache memory, so the size of the cache matters. Ideally you want most references to be found in the yellow box (the small static cache).

The Cache Idea: Program-Transparent Mem­ory Hierarchy

Cache con­tains “tem­po­rary copies” of selected main mem­ory locations.

Chal­lenge is to make hit ratio as high as possible.

Goals:

  • Improve the aver­age access time
    • HIT RATIO : Frac­tion of refs found in CACHE
    • MISS RATIO: Remain­ing references
  • Trans­parency (com­pat­i­bil­ity, pro­gram­ming ease)

How High of a Hit Ratio?

Sup­pose we can eas­ily build an on-chip sta­tic mem­ory with a 4ns access time, but the fastest DRAM that we can buy for main mem­ory has an aver­age access time of 40ns. How high of a hit rate do we need to sus­tain an aver­age speed of 5ns? (Only slightly slower than cache?)

Over 97% of the time, the instruction should be found in the small cache. Over a period of time there is a subset of instructions that the processor needs for its computation; if the cache is big enough to hold that working set, we can achieve this hit ratio. The time the CPU spends processing should be balanced against the time needed to load the misses.
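To make the arithmetic explicit (a back-of-the-envelope check, modeling the average access time as the cache access time plus the miss penalty):

t_avg = t_cache + (1 − hit ratio) × t_DRAM
5ns = 4ns + (1 − HR) × 40ns  ⇒  1 − HR = 1/40  ⇒  HR = 0.975

So we need roughly a 97.5% hit rate, which matches the "over 97%" figure above.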

The Cache Principle

ALGORITHM: Look nearby for the requested infor­ma­tion first, if it’s not there, check sec­ondary storage.

Basic cache algorithm:

The cache knows two things: which addresses it holds and their contents. If there is a hit, the CPU can read or update the data in the cache, and it is then the cache's responsibility to update main memory. If there is a miss, the cache has to evict something and replace it with the requested data from main memory.
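A minimal sketch of that algorithm in JavaScript (a write-through, fully associative toy cache with a naive replacement choice; all names are illustrative):

function ToyCache(capacity, mainMemory) {
  this.capacity = capacity;
  this.mainMemory = mainMemory; // e.g. a plain object: address -> value
  this.lines = {};              // the "temporary copies": address -> value
}

ToyCache.prototype.read = function (address) {
  if (address in this.lines) {                   // HIT: answer from the cache
    return this.lines[address];
  }
  var value = this.mainMemory[address];          // MISS: go to main memory
  var cached = Object.keys(this.lines);
  if (cached.length >= this.capacity) {          // cache full: evict a victim
    delete this.lines[cached[0]];                // naive replacement choice
  }
  this.lines[address] = value;                   // keep a copy for next time
  return value;
};

ToyCache.prototype.write = function (address, value) {
  this.lines[address] = value;                   // update the cached copy ...
  this.mainMemory[address] = value;              // ... and write through to memory
};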

Asso­cia­tiv­ity: Par­al­lel Lookup

Look at every row (line) of the cache and see whether it has what the CPU is looking for, all in parallel. Any data item can be located in any cache location. Fully associative caches are very expensive, and roughly half of the storage area goes to holding the addresses (tags).

Direct-Mapped Cache

A cheaper alternative to an associative cache. It is non-associative: instead of comparing against every line in parallel, it uses part of the address as a table index to locate the cached data quickly, because doing the lookup in parallel over all lines is expensive.

Direct Mapped Cache

Problem: contention and cache conflicts. Improve the mapping/indexing function: use the low-order bits of the address (rather than the high-order bits) as the index, since the high-order bits do not change much given locality of reference.
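A small illustration of that indexing choice (plain JavaScript; the parameters are made up):

// Pick the line index from the LOW-order bits of the block number, as suggested
// above, so that nearby addresses map to different lines and contention drops.
function directMappedIndex(address, numLines, blockSize) {
  var blockNumber = Math.floor(address / blockSize);
  return blockNumber % numLines;   // low-order bits of the block number
}

// With 8 lines and 4-byte blocks, addresses 0, 4, 8, ... map to lines 0, 1, 2, ...
// Using the HIGH-order bits instead would send long runs of nearby addresses
// to the same line and cause conflicts.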

L1 caches are very small but very fast: a few thousand entries long, responding in 10ps.

Next lec­ture deals with the cache issues, and if there is a happy mid­dle ground.

Fully Asso­cia­tive

  • Expen­sive
  • Flex­i­ble: any address can be cached in any line

Direct Mapped

  • Cheap (ordi­nary SRAM)
  • Con­tention: Addresses com­pete for cache lines.

Difference between MVC vs. MVP vs. MVVM


Both MVP and MVVM are derivatives of MVC (see the timelines and how these have evolved). The key difference between them is the dependency each layer has on the other layers, and how tightly bound they are to each other. See the diagram and the references column for more details.

These patterns mainly try to address the problems of structuring code that relate to 1. application state, 2. business logic and 3. state and view synchronization.

MVP sits somewhere in the middle of MVC and MVVM. It is also known as the Presentation Model pattern.
Explanation and flow

MVC:

  • A user input, like the click of a link or a URL, is first intercepted by the controller.
  • A controller can output different views, based on authorization, error validation, success or custom logic, etc. Note the many-to-one relationship, and the one-way communication from controller to view.
  • The controller passes the model to the view, and the view binds itself using a templating engine (Razor in the case of ASP.NET MVC).
  • The model is usually a data object, a POCO (Plain Old CLR Object), with minimal to no methods (behavior).

MVP / Presentation Model:

  • A user input begins with the view, not the presenter. The view invokes commands on the presenter, and the presenter in turn modifies the view.
  • View and Model never communicate or know of (refer to) each other.
  • The presenter is a layer of abstraction over the view.
  • There is always a one-to-one mapping between a presenter and the view.
  • The Presentation Model and the View talk to each other. The View grabs properties and calls methods on the PM. The PM exposes properties and methods for the View and dispatches events, which the View may listen to.
  • The PM talks to the Model in the domain layer, either through a reference it contains or indirectly through messages.

MVVM:

  • A user input begins with the view and may end up executing a ViewModel behavior.
  • View and Model never communicate or know of (refer to) each other.
  • The ViewModel is a strongly-typed model for the view that is an exact reflection (metaphorically speaking) or abstraction of the view.
  • ViewModel and View are always synced.
  • The Model has no idea that the View and ViewModel exist, and the ViewModel has no idea that a View exists, which promotes decoupling scenarios that pay dividends.

Ref­er­ences

In C#, a "reference" means that one class uses (holds a reference to) the other.

In JavaScript, it means a module refers to another module, or, in the case of the View, that the HTML contains a reference to the JavaScript module.

MVC:

  • The View refers to the Model, but not vice versa.
  • The Controller refers to the Model, populates it and passes it to the View.
  • The View is oblivious of the Controller, but refers to and expects a particular type of Model.

MVP / Presentation Model:

  • The Presenter (Presentation Model) needs a reference to the View.
  • The View also has a reference to the Presenter, which responds to the user events.
  • The Presenter has a reference to the View and populates it, as opposed to the View binding to the Model for every interaction.
  • To decouple them, there usually is an abstract class or an interface that the View and PM share.

MVVM:

  • Unlike the Presenter, a ViewModel does not need a reference to a view. The View binds to properties on a ViewModel.
  • The View has no idea that the Model class exists.
  • The ViewModel and Model are unaware of the View.
  • The Model is completely oblivious to the fact that the ViewModel and View exist.

View

Views are often defined declaratively, often using a tool or a designer (think HTML or XAML).

MVC:

  • Views are responsible for generating the markup, typically using a templating engine or a declarative language (HTML). The views may contain conditional code based on Model properties.
  • Either a different View is used for Edit and Read mode, or the same view with conditional logic is used based on a model property.

MVP:

  • The View has to expose an interface that can be used by the presenter.
  • The Presenter implements an interface and provides the required methods defined in that interface.
  • The View, in turn, uses the interface exposed by the presenter.

MVVM:

  • The view is declarative and contains the data-binding code that refers to the ViewModel.
  • There is a two-way binding, and the view is always synced with the ViewModel.

Exam­ples you may use in views:

  • For­mat­ting a dis­play field (date string)
  • Show­ing only cer­tain details depend­ing on state. (only show edit if admin)
  • Man­ag­ing view ani­ma­tions. (on hover, do some­thing)

Controller or Presenter or ViewModel

MVC:

  • A Controller (or area) is reached through a routing engine, which is a set of rules based on the input (URL), or an API path in the case of AJAX requests.
  • The Controller decides which view has to be displayed, based on user input or the current state of the user's interaction with the application.
  • The View sends the input through a URL, which is intercepted by the routing engine and routed to the appropriate controller.
  • The Controller modifies and populates the Model and hands it over to the View.
  • There is typically an action method in a Controller for each user interaction and its variants.

MVP:

  • The code-behind (aspx.cs) in ASP.NET represents the presenter, loosely speaking. The interface in this case is the Page class that is inherited by every aspx.cs file.
  • In the case of composition, a Presentation Model may contain one or many child Presentation Model instances, but each child control will also have only one Presentation Model.

MVVM:

  • The ViewModel does not need a reference to the View, which promotes loose coupling and reuse of the same ViewModel for different views. Imagine the same ViewModel used for a website, a mobile application and a tablet application.
  • A ViewModel encapsulates the current state of the view as displayed on the screen, as well as the various commands or behaviors based on events.
  • A ViewModel may act as an adapter that transforms the raw model data into the format to be displayed to the user, as in the sketch below.
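A rough sketch of that adapter role in plain JavaScript (the model shape and names are made up for illustration):

// Raw model as it might arrive from a service: general-purpose, not display-ready.
var rawModel = { firstName: 'Ada', lastName: 'Lovelace', createdUtc: '2014-02-01T10:00:00Z', isAdmin: true };

function UserViewModel(model) {
  this.displayName = model.firstName + ' ' + model.lastName;          // formatting for display
  this.memberSince = new Date(model.createdUtc).toLocaleDateString(); // raw value -> display format
  this.canEdit = model.isAdmin;                                       // view-specific state
  this.save = function () { /* command/behavior the view can bind a button to */ };
}

var vm = new UserViewModel(rawModel); // the view binds to vm, never to rawModel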

Why do we need ViewModels?

  • Incorporating dropdown lists of lookup data into a related entity
  • Master-detail record views
  • Pagination: combining actual data and paging information
  • Components like a shopping cart or user profile widget
  • Dashboards, with multiple sources of disparate data
  • Reports, often with aggregate data

Variants:

  • The passive view implementation, in which the view contains no logic; it is a container for UI controls that are directly manipulated by the presenter.
  • The supervising controller implementation, in which the view may be responsible for some elements of presentation logic, such as data binding, and has been given a reference to a data source from the domain models. (This is closer to MVVM.)

Model

Models are often received from a service or through a dependency-injection interface, with data presented in a format that caters to a larger consumer base than just our UI's needs.

  • The model objects you receive from the underlying services are raw and in a format that caters to the different consumers of the service.
  • The view may not use the entire model, just a smaller subset of it.
  • Typically there is a need to collect different models from services into a single model.
  • Typically a domain-layer object that contains domain models, commands and a subscription service.
  • The Model is typically a server class transformed into JSON or XML and sent over the wire; on the server side it may be a pre-defined domain class that is more general than what the view requires.
  • In the case of undoable operations, a ViewModel can refer to the Model to restore the original state.

Appli­ca­tions

  • Perfect for the web/HTTP, accommodating its stateless nature and addressability.
  • Disconnected, stateless applications.
  • REST-based thin clients, as routing is inherent to this pattern.
  • Mobile applications implemented using HTML5.
  • Classic WebForms ASP.NET.
  • SmartUI or rapid application development.
  • SharePoint web parts.
  • Windows Forms (WPF).
  • Migrating from legacy code, where UI logic is already wired up.
  • Heavy intranet workflow-based applications.
  • State-heavy web applications or views.
  • Silverlight or Rich Internet Apps.
  • Windows Phone or Android.
  • Highly event-driven and stateful UI.
  • Two-way binding.
  • UI where a user interacts with the app for a long time before saving the state.
  • Works well where the connection between the view and the rest of the program is not always available.

Pat­terns and Practices

  • Front con­troller

    Think of Spring frame­work.

  • Con­troller is like the Strat­egy design pattern.
  • Page Con­troller

    Think ASP.NET aspx pages with a com­plete Page life­cy­cle. (Init-Load-Validation-Event-Render-Unload)

  • The Presenter acts as a mediator.
  • Observer or Publish/Subscribe (INotifyPropertyChanged, IObserver)
  • The ViewModel exposes an Observable.

Framework/Library

Client-side (MVC):

  • Backbone.js, Knockback.js, Spine.js, Angular.js

Client-side (MVP):

  • Riot.js
  • GWT

Client-side (MVVM):

  • Knockout.js
  • Kendo (MVVM)

Server-side (MVC):

  • ASP.NET MVC
  • Spring MVC
  • Ruby on Rails

Server-side (MVP):

  • Classic ASP.NET
  • JSP Servlets

Server-side (MVVM):

  • WPF (Desktop) or Silverlight
  • Windows Phone apps (XAML)
  • Adobe Flex

Advan­tages

MVC:

  • Routing is inherent to this pattern, and the Controller acts as a mediator between presentation (View) and data (Model).
  • Routing gives greater control over the application structure and makes it manageable.
  • The abstractions are properly separated, which enables more control over each layer, especially the view, which is now clearly separated from the state.
  • Separation helps with testability.

MVP:

  • The goal of MVP is to separate state and behavior out of the View, which makes it easier to migrate legacy spaghetti applications to MVP as a first step.
  • Since the Presenter is always written against an interface, it provides a GUI-agnostic, testable interface.
  • Imposes a consistent interface pattern that developers can follow.

MVVM:

  • Attempts to clearly separate the declarative UI from the business logic.
  • Promotes parallel development, where UI developers write the bindings while the Model and ViewModel are owned by application developers.
  • Clearly separates out the view logic and makes the view dumber, with the least amount of logic.
  • In practice a website, a mobile application and a tablet application all need different views, but can share the same ViewModel.
  • A ViewModel is easier to unit test than event-driven code, and keeps the issues of UI automation testing out of the way.
  • A ViewModel can be reused for different representations, as it is highly decoupled from the View.

Dis­ad­van­tages

  • If the model data is coming from the backend, it typically needs some sort of transformation, as simple as converting an enum to a string or as complex as calculating a number of days from different data properties of the model. Slowly the view starts holding more and more logic.
  • Mechanisms like ViewBag/ViewData exist and are abused as a substitute for an actual model when the model is not large.
  • In practice the Model from the back-end repository is not usable due to different property names or data-structure formats. A new abstraction of the Model is created, and it is often a pain to map this new ViewModel to the Model and manage changes.
  • The design pattern seems to work against the constraints of the HTTP web, as it demands heavier bandwidth, which is not free or unlimited.
  • The view is still tightly coupled compared to MVC.
  • Debugging the events fired from the UI is harder due to their intermingling with the View.
  • It is hard to stick to one of the variants of MVP in all cases, resulting in a mixed code base.
  • Cannot always be done in parallel, as the interfaces need to be defined and agreed upon first.
  • For simpler applications it is overkill.
  • As opposed to MVC, the declarative bindings in MVVM make it harder to debug.
  • Data binding on simpler controls is more code than the data itself.
  • Data-binding implementations keep a lot of in-memory book-keeping.
  • Whether View development drives the ViewModel or vice versa is unclear, which makes it harder to communicate.
  • Sometimes criticized because markup and JS code (the data bindings) are intermixed. Unmanaged data binding can consume considerable memory.
  • John Gossman points out that generalizing Views for a larger application becomes more difficult.
  • A ViewModel is a class that is not a POCO or POJO, but it's still worth the effort.

History/Evolution

  • MVC was designed by Trygve Reenskaug in 1979 during his time working on Smalltalk-80 at Xerox PARC. The definition has evolved heavily during the following years.
  • MVP was proposed by Mike Potel of Taligent Inc. (a subsidiary of IBM) in 1996.
  • MVVM was defined by John Gossman at Microsoft in 2005 for use with Windows Presentation Foundation.

REST — Address-ability through URL and the interwebs


1. URLs outlive the technology and the underlying stack. (So .aspx, .cgi or .php is not good enough; leave those details to the MIME type in the HTTP header, which is where they really matter.)
2. URLs are used by search engines; file extensions like .aspx or .php do not add any value to search. They are not keywords.
3. People bookmark pages and pass the bookmarks around (URLs do end up on billboards and pamphlets). Don't hand me your bit.ly card now, those URLs make no sense to me.
4. Use safe characters like the underscore (_) in resource names to make URLs more readable.
5. A URL doesn't restrict someone to a particular language or technology. (That can be specified in the content negotiation of the HTTP headers.)

- Content negotiation does not have to be just about the representation format; it can also be about the language.

Last-Modified:
If-Modified-Since:
The browser looks at the request and says: hey, I think I have this resource, but I don't know whether it's the latest and most up-to-date copy.
The browser then appends a header called If-Modified-Since.
A 304 response says that the resource was not modified, so the server doesn't send the entire representation again.
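As a concrete illustration (the host, path and dates are made up), the conditional GET exchange looks roughly like this on the wire:

GET /articles/rest.html HTTP/1.1
Host: example.com
If-Modified-Since: Tue, 04 Mar 2014 08:00:00 GMT

HTTP/1.1 304 Not Modified
Last-Modified: Tue, 04 Mar 2014 08:00:00 GMT
Cache-Control: max-age=3600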

- The Accept-Encoding header is the way a client advertises what sort of encoding it can understand and interpret (e.g. gzip). Compression details are abstracted away by HTTP, but servers can be smart in making it possible.

- Persistent (saves overhead) versus parallel (improves speed) connections; the two need to be balanced.

Public caches — at internet service providers, or for a company or a university.

Private caches — for a single user, e.g. Internet Explorer's cache (type about:cache in Chrome to inspect Chrome's).

The rules for maintaining an up-to-date cache are a bit complex.

  • Always cache safe requests; always cache a GET request.
  • The server (which is the source) can influence cache settings using headers: Cache-Control, Expires or Pragma, with values such as public, private and no-cache.

Concurrency: Event loops and Message Queues

I remember I was asked a very general and common question in an interview.

How does an ajax request work?

With two years of experience in the field, you certainly know about XmlHttpRequest (which, despite the name, has nothing to do with XML) made using JavaScript. Even though $.ajax steals the thunder by abstracting away the details, a good developer understands what happens "under the hood", for example browser incompatibilities with the XHR object.

I also knew about the latest version 2, the difference between ActiveXObject or XDomainRequest for IE and XmlHttpRequest for the rest of the world, and some history as it relates to the browser wars of the ages and the maverick standards implementations. Although knowing these sounds impressive, I was still not able to get into another level of detail. The lower in abstraction you dig, the more esoteric the knowledge gets; it is a good candidate topic to discuss and evaluate how much you really know your stuff.

I was sort of stuck at the point that the browser makes an HTTP request for you and that it's asynchronous (based on a flag on the XmlHttpRequest send() call), which invokes your callback once the request returns. I was missing knowledge of the whole asynchronous architecture of JavaScript and the browser. I knew that JavaScript is single-threaded and that there is a JavaScript engine (V8, Chakra, WebKit, etc.) within the browser. I also knew that browsers often make more than one HTTP request (6 to 8) at the same time, likely on different threads. Our discussion stopped at who keeps monitoring whether the HTTP request has come back, and who informs the call stack (or whatever mechanism there is) to execute the callback. HTTP communication happens as a request/response model, and that exchange itself is synchronous as opposed to asynchronous. We'll leave DNS, TCP connection open/close and ACKs out of the discussion.

What I was missing is the concept of the JavaScript message queue and the event loop, and another similar concept that I learnt later in C# (and that applies to any language like Java/C++): the idea of non-blocking threads and different concurrency models.

Consider this: if JavaScript code is executed by your JavaScript engine in a single thread, how can things be achieved asynchronously? Effectively, how can you do two things at the same time while waiting for the first one to complete?

Publish/Subscribe

Publish/Subscribe is a common design pattern, seen and used in more places than you can imagine. Mainly used for event-driven designs, it is a level up from the Observer pattern in terms of decoupling systems. What is unique about publish/subscribe is that there is a message, or token (sometimes called a topic), involved. The only thing the publisher and the subscriber have in common is the token: the publisher does not have any reference to the subscriber, and vice versa. A minimal sketch follows.
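A minimal publish/subscribe sketch in JavaScript (names are illustrative):

var bus = {
  topics: {},
  subscribe: function (topic, handler) {
    (this.topics[topic] = this.topics[topic] || []).push(handler);
  },
  publish: function (topic, data) {
    (this.topics[topic] || []).forEach(function (handler) { handler(data); });
  }
};

bus.subscribe('user:login', function (name) { console.log('hello ' + name); });
bus.publish('user:login', 'ketan');   // the publisher knows only the token, not the subscriber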

Broadly speaking, inter-process communication (IPC) happens in two ways: 1) message passing and 2) shared memory. As you guessed, in the case of JavaScript it is through message passing. While JavaScript is single-threaded**, it uses a messaging mechanism to execute synchronous as well as asynchronous code. If you want a multiple-thread example, web workers run in parallel and communicate with each other using messages (or tokens).

Node.js is an event-loop-based server: it has an effectively infinite loop that listens for messages in a queue and executes them in order of arrival, which in this case means HTTP requests.

Event Loops and Mes­sage Queues
Have you ever written a Windows console application or anything alike? If so, you know that once Main() finishes, the program ends. What if you want to extend this console application to accept requests at ANY TIME, like a web server? It should accept a request at any time and process it. Or remember how a console can wait until user input is entered using ReadLine? One way to do this is to write an infinite loop with an exit condition: if I pass the token "end" as a string, the loop stops; otherwise, for any other token, it processes it. Now let's say there are 1,000 requests: how do you handle them? The best way is to queue each request, and the loop will process each request one by one, as in the toy sketch below.
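A toy version of that loop in JavaScript (not how any real runtime is implemented, just the shape of the idea):

// An "event loop": drain a message queue in FIFO order, with "end" as the exit condition.
var queue = ['request 1', 'request 2', 'end'];

function process(message) { console.log('processing ' + message); }

while (queue.length > 0) {
  var message = queue.shift();   // dequeue in order of arrival (FIFO)
  if (message === 'end') break;  // exit condition
  process(message);              // otherwise, handle the message
}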

In the browser world and the JavaScript runtime, this is the event loop. Let's say there are 10 events triggered by the user at the same time by clicking a button, perhaps through event bubbling. A token for each event is put into the message queue (following the rules of event ordering), and every event handler attached to that event or token is queued and executed in FIFO manner.

Remember that in the JavaScript execution environment, when a token is dequeued to be processed, the environment reloads the entire context needed for the subroutine attached to that event, including the closures and local variables available to it. This is roughly analogous to the subroutine calls made by the Java runtime, the CLR or any such environment.

The event tokens or event messages are the same kind of tokens used in the publish/subscribe pattern. The publisher is the event loop that executes them, and the subscribers are the handlers attached to that token, for example a button click.

Does an asyn­chro­nous Xml­HttpRe­quest cre­ate new threads?
NO!! In the sense that when the JavaScript code for an XmlHttpRequest is encountered during execution, no new threads are created for your code. Instead, the send() method returns immediately and the next statement in the code executes. It is a fire-and-forget mechanism: the execution makes a call to the server and forgets about it. When the response arrives back at the browser (think hardware interrupts), a new message/token with the response is queued onto the event loop. If you're attentive, you will ask: WHO THE HELL puts this message in the queue? You're right, it's another thread in the same process (browser tab) that does that, and the queue operation has to be thread-safe in this case, sure!! So there are two threads involved, although it is still a single-threaded operation from JavaScript's point of view, because the lower layer of the network stack acts like an interrupt and queues the response back to the main event loop. The shape of it in code is sketched below.
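In code, the fire-and-forget shape looks like this (the URL is a placeholder):

var xhr = new XMLHttpRequest();
xhr.open('GET', '/api/data', true);   // true = asynchronous
xhr.onload = function () {            // runs later, when the event loop dequeues the response message
  console.log('got response: ' + xhr.status);
};
xhr.send();                           // returns immediately: fire and forget
console.log('this line runs before the response arrives');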

In short, JavaScript code is single-threaded. No JavaScript code ever preempts any other JavaScript code, ever.

Browsers allow up to 6–8 HTTP requests per window per domain (this used to be 2 before 2008). Depending on the implementation of the browser, each request is made by a different thread, and these threads queue the responses to the main event loop.

How does this ques­tion apply to asyn­chrony in gen­eral?
An asyn­chro­nous call may launch another thread to do the work, or it might post a mes­sage into a queue on another, already run­ning thread. The caller con­tin­ues and the callee calls back once it processes the message. 

 

Ref­er­ences:
Con­cur­rency in the lan­guage of the web: https://docs.google.com/presentation/d/1KtgaIvDQwMaqZ6ax3zU2oka62sF2ZQSPv1SEirD-XtY/edit?pli=1#slide=id.p
If you under­stand metaphors bet­ter: https://developer.yahoo.com/blogs/ydnfourblog/part-1-understanding-event-loops-writing-great-code-11401.html
Details of com­po­nents for JavaScript code exe­cu­tion: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/EventLoop
Does an asyn­chro­nous call always cre­ate a new thread: http://stackoverflow.com/questions/598436/does-an-asynchronous-call-always-create-call-a-new-thread

http://stackoverflow.com/questions/7575589/how-does-javascript-handle-ajax-responses-in-the-background/7575649#7575649

http://stackoverflow.com/questions/2914161/ajax-multi-threaded


JavaScript debounce and throttle

There are events in JavaScript that fire more rapidly than you would want.

For example, let's say you want to do an auto-complete on keypress. There is only one way to hook into the user typing in the text box: key-down, key-up or keypress. Firstly, you don't want your client function to be called every 1/10th of a millisecond; even if you let it, the IO is probably going to take longer. Moreover, you don't want to barrage your web server with a request for every single letter typed. You want the user to complete the word, or at least wait for maybe 500ms, before firing an autocomplete request.

Let’ say you do not debounce and attach a key­press with a call­back that takes up a lot of time. The browser will pru­dently run your func­tion, but the user will feel that the web­site is stuck for as long as your long run­ning job is finished.

Scrolling is another example, where you would want the user to complete the scroll before you re-position your elements or do something similar.

There are two ways to deal with the same event being triggered multiple times within a short time span.

1. Delay the function call by x milliseconds, resetting the delay every time a new event comes in (DEBOUNCE).

2. Allow the function to run at most once per x-millisecond window, no matter how many times it is called (THROTTLE).

There is a very good elevator analogy that clearly distinguishes the two.

Debounce: delay the elevator every time a person shows up.

Throttle: a timed limit of 10 minutes on a subway ride: the doors close no matter what (ignoring the complexity of sensors delaying them).

There is a good exam­ple of how to use jQuery debounce:
http://benalman.com/code/projects/jquery-throttle-debounce/examples/debounce/

$('input.text').keyup( $.debounce( 250, text_2 ) ); // This is the line you want!


function debounce(func, wait) {
  var timeout; // closure variable holding the pending timer

  return function () {
    var context = this,
        args = arguments;
    clearTimeout(timeout);              // a new call cancels the pending one
    timeout = setTimeout(function () {
      func.apply(context, args);        // only the last call within `wait` ms runs
    }, wait);
  };
}

 

For throt­tle code see this blog
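In case that link goes stale, here is a minimal leading-edge throttle sketch (the window length and handler name are just for illustration): calls that arrive inside the window are simply ignored.

function throttle(func, wait) {
  var lastCall = 0; // closure variable: when func last actually ran
  return function () {
    var now = Date.now();
    if (now - lastCall >= wait) {
      lastCall = now;
      func.apply(this, arguments);   // run at most once per `wait` window
    }
  };
}

$(window).scroll(throttle(onScroll, 250)); // hypothetical onScroll handler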

A nice live exam­ple to show you the dif­fer­ence can be found here

As a side note, these concepts are not specific to front-end engineering; they go as far back as the hardware circuit level and, as the metaphor goes, exist in the real world too. :-)

Ref­er­ences:

http://drupalmotion.com/article/debounce-and-throttle-visual-explanation

http://davidwalsh.name/javascript-debounce-function


Binary Tree problem guidelines and characteristics

At a very high level, there are certain characteristics of tree problems that pop up very often.

They pop up in real-world scenarios. For example, a restricted BFS will give you the LinkedIn-like degrees of connection. Doing a full BFS for each connection is a bit pricey, but doing it to the 2nd or 3rd degree is not that bad.

Now that I've emphasized enough why you should know these characteristics, here are some general guidelines.

        A
      /   \
     B     C
    / \   / \
   D   E F   G

Tree Traversals:

  1. BFS (uses a queue) and results in level order (A | B C | D E F G)
  2. DFS (uses the recursion stack) and results in (A | B D E | C F G)
  3. Pre-order (a type of DFS) and results in (A | B D E | C F G)
  4. In-order (a type of DFS) and results in (D B E | A | F C G)
  5. Post-order (a type of DFS) and results in (D E B | F G C | A)
  • Post-order and pre-order can generate arithmetic expressions (postfix/prefix notation) that are not ambiguous to a computer.
  • For a very large tree, DFS will eat up the recursion stack space, so a BFS may be useful.
  • BFS is also memory-intensive in that it uses a queue, although it walks the nearest neighbors first.
  • In-order traversal is useful for BSTs, and yields a human-readable arithmetic expression.
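Minimal sketches of the two basic traversals, for a node shaped like { value: 'A', left: ..., right: ... } (names are illustrative):

function bfs(root) {                  // level order, uses a queue
  var result = [], queue = [root];
  while (queue.length) {
    var node = queue.shift();
    if (!node) continue;
    result.push(node.value);
    queue.push(node.left, node.right);
  }
  return result;                      // A, B, C, D, E, F, G for the tree above
}

function dfsPreOrder(node, result) {  // pre-order, uses the call stack
  result = result || [];
  if (!node) return result;
  result.push(node.value);            // visit, then left subtree, then right
  dfsPreOrder(node.left, result);
  dfsPreOrder(node.right, result);
  return result;                      // A, B, D, E, C, F, G for the tree above
}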

Class of Problems:

DFS tra­ver­sal with height.

  • Get the height: the (implied) maximum (or minimum) height of the tree.
  • Is the tree balanced?
  • Is the tree symmetric?
  • Calculate the diameter of the tree.
  • Is T1 a subtree of T2 (without loss of generality)?
  • Is one binary tree the mirror of another?
  • Print the cover of a binary tree.
  • Print the right view of the binary tree.

DFS Order Traversal

  • In-order (recursive and iterative). There is a way to do in-order traversal without a system or application stack, called Morris traversal.
  • Pre-order traversal (recursive and iterative) using one stack.
  • Post-order traversal (recursive and iterative) using two stacks.

Path sum (DFS)

  • All paths that sum up to a value
  • Max­i­mum path sum in the entire tree.

** There is only one path between any two nodes in a tree; this is part of what makes it a tree rather than a general graph. In particular, there is exactly one path from the root to any other node.

 BFS tra­ver­sal — level order

  • Print a binary tree in level order.
  • Convert a binary tree into a linked list by level order.
  • Are two nodes cousins in a binary tree? (Different parents, same level.)
  • Reverse the nodes in alternate levels of a binary tree.

** Understand that the level-order sequence can also be produced using DFS, although it is more compute-intensive, even though it may use less space.
** Understand that at any point in time, the number of elements in the queue for a BFS can be as large as the widest level of the tree (often the level with the most nodes, near the leaves).
** Know at what point the level changes while traversing.

Seri­al­ize and de-serialize

  • Serialize a binary tree using a sentinel and then deserialize it, as in the sketch below.
  • Reconstruct a tree given its in-order and pre-order traversals.
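A short sketch of pre-order serialization with a '#' sentinel for null children, plus the matching deserialization (assuming node values contain no commas):

function serialize(node) {
  if (!node) return '#';   // sentinel for a missing child
  return node.value + ',' + serialize(node.left) + ',' + serialize(node.right);
}

function deserialize(data) {
  var tokens = data.split(','), i = 0;
  function build() {
    var token = tokens[i++];
    if (token === '#') return null;
    return { value: token, left: build(), right: build() };
  }
  return build();
}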

Oth­ers

  • Least com­mon ances­tor of two nodes.
  • Least com­mon ances­tor of two nodes given a par­ent pointer.

Refactoring and obsession for code quality

When a noob starts coding, the fight is to actually build solutions and come up with working code. Even a moderate coding task seems to take up most of the mind's processing power on syntax, modeling and implementing the algorithm bug-free. There is little time and brain energy left for code cleanliness, refactoring and making it extensible the first time around.

Code cleanliness comes with practice and years of experience. It becomes second nature: when you start coding your ideas, you already have an approach in mind and you start typing without much scribbling.

In a real-world scenario (teams with tight budgets, or a startup once it starts growing) the code gets rampant pretty quickly. With a lot of turnover or attrition in teams, these issues proliferate. Iteration after iteration of bug fixes, with tight deadlines and spitting out new features, can make code quality degrade very fast. In my experience, I have both written code from the ground up (the first line of code) and improved and added to 8-year-old code. In green-field coding, as our features grew in number and the need to squeeze efficiency out of the system increased, I had to refactor, and refactor, and refactor what I had just refactored, over and over again. When you want to sail across a pond, a kayak is enough; as you grow out of the pond, only then do you need a boat. If you're Facebook, Twitter or Google, you then need to build a ship; yet no project at these companies started by building a ship from the first line of code. Only incremental refactoring and the right balance of code quality can get you further. It is like maintaining your car: if you want it to run faster and longer, keep getting it serviced, or else the carburetor will clog and the tires will wear (even though you're still getting mileage from it).

The degradation of code eats into developer efficiency. The time to add a new feature to a degraded codebase is high, and the chances of shipping that feature bug-free are low. When developer efficiency goes down, you find developers sitting late in the office and working weekends, unable to express their concerns to program managers or leads. Every now and then a developer needs to stand up and say: that's it, this needs attention.

Incremental refactoring is one way to start charity at home, but it comes with its own risks, related to team dynamics and the team's priorities. No customer will applaud you for refactoring some code. Maybe they would if there were an efficiency gain, but does the product really need it at this point in time? Maybe you will be praised when you leave the team, or no one will ever know; not that it should matter. And suppose you introduce a bug while refactoring; then another can of worms opens up. It is always better to communicate the need for refactoring, and why your estimates are larger than they otherwise could be.

I will share some common scenarios that I run into when refactoring client-side as well as server-side code. A number of times I find myself put into a team, or watching as time passes within the current team, where things start to degrade.

The following are the usual issues:

  • Make the module more resilient (to deal with other cases)
  • Make it extensible (one-off/two-off is not enough)
  • Generalize the code to accommodate more scenarios
  • Make it scale (by making it async or parallelizing it)
  • The module is using a pattern in the wrong way
  • It's spaghetti, with cross dependencies

Refac­tor­ing JavaScript and CSS code.

Your team added a UI section. It turns out it can be reused in another place. You do your best to refactor it and make it reusable. Now it has to talk to a different module, and that module has its own flavors. There are more UI sections; you started out with one CSS file and one JS file, and now they are growing. Code is running rampant. This JavaScript file uses variables from another window and other JavaScript files, just because both happen to be available in the DOM. Now there is a requirement to add more to it. Someday there will be a need to separate it all out.

I constantly find myself splitting rampantly growing CSS files into different sections, and now different files, using LESS. One UI bug fix after another, the code quality degrades very badly. It has to stop at some point. One of my biggest pet peeves is finding two styles with the same selector in the same file: you fix the bug in one place and the other overwrites your fix.

Likewise in JavaScript, you add a function module, and then another. You separate the concerns into files, but soon one makes calls to the other and the other calls back into the first. There is a need to modularize the code and then assemble it into a single file.

For CSS and JS there typically is a one-to-one mapping, because JS modifies CSS or styles on the page more often than you would like. Each widget and each UI module should have its own folder structure, file naming conventions and decoupling from other packages. This makes it simpler to reuse these modules without any issues.

I have done it often times using tech­nolo­gies like LESS, GRUNT, jQueryUI wid­get fac­tory and requireJS.

Refac­tor­ing C# code for qual­ity and performance

In one of my gigs, we were a SharePoint Gold partner that wanted to make money off of customers who have SharePoint. Trust me, most of the Fortune 500 keep their intellectual property in SharePoint and use it as a major collaboration tool. With the advent of cloud computing, Microsoft introduced Office 365 and started to promote cloud instances of SharePoint, as opposed to on-premise, for these big customers. Our code would sit on top of the Microsoft SharePoint stack, and Microsoft opened channels for partners like us to run code in the cloud within their hosted SharePoint environment. You bet that, since they own the servers and the uptime, they have strict rules for the code.

Typically, in a large code base, compiler warnings are ignored, mostly because of their low priority. I rarely find projects with zero warnings during the build; it is very difficult to achieve and it's a moving target. Yet there are teams who start (and they should) with that level of code cleanliness. It's a developer's paradise, and I would certainly want to live there. And then there is reality: deadlines, new people and off-shore teams collaborating.

Without digressing much: Microsoft only allowed our code to run if it was compliant with zero warnings (with some degree of tolerance, via a threshold that suppresses minor warnings).

MSOCAF = CAT.NET + FxCop + SharePoint API rules.

  • Making calls async (non-blocking) and squeezing out efficiency.
  • Using IDisposable to dispose of objects properly, with a fallback to finalization.
  • Implementing the event pattern that C# provides properly, by only passing a type that derives from EventArgs.
  • Making sure XSS is avoided by encoding the user inputs (AntiXSS).
  • No reflection (it's pretty costly; avoid it at all costs).
  • Using the efficient versions of the API through proper looping.
  • Avoiding deprecated features of the language.
  • Storing passwords in a protected string class.
  • Protecting the native code modules.
  • Removing unused variables.

There are a lot of rules that one cannot remember while coding, since you are focused on solving the problem. Static code analysis tools help with code cleanliness. For us it was a requirement, but a right balance should be maintained between engineering and business goals, and you should communicate it by making everyone aware.

In summary: refactor, but be aware of the priorities. Make sure you raise your voice when you find a trouble spot where code quality is degrading. Always keep the right balance: don't be too obsessed with the implementation, but don't be a cowboy who keeps adding to the regressions either. Just maintain the right balance and communicate.

–Ketan

 

End­ing on a lighter note: Hope you get the joke.

Good-Cheap-Fast vs. reality

 

 


Promises versus Deferred

A very common question that comes to mind after reading about promises and deferreds in theory is: what is the difference?

  • a promise rep­re­sents a value that is not yet known
  • a deferred rep­re­sents work that is not yet finished

Another way to look at this is:

  • A promise is a placeholder for a result which is initially unknown.
  • A deferred represents the computation that results in the value.

A deferred has a promise. Remember that in jQuery you can return just the promise: the promise interface does not have the resolve/reject methods, but has everything else: done/fail/always. A small sketch follows.
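A small jQuery sketch of that split (the function and its data are made up; the timeout stands in for real asynchronous work):

function loadUser(id) {
  var deferred = $.Deferred();          // the work that is not yet finished
  setTimeout(function () {
    if (id > 0) { deferred.resolve({ id: id, name: 'ketan' }); }
    else { deferred.reject('invalid id'); }
  }, 500);
  return deferred.promise();            // the value that is not yet known
}

loadUser(42)
  .done(function (user) { console.log('loaded ' + user.name); })
  .fail(function (err) { console.log('failed: ' + err); })
  .always(function () { console.log('finished either way'); });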

See the more detailed articles.

 

Ref­er­ences: 

(One of the best arti­cles) http://blog.mediumequalsmessage.com/promise-deferred-objects-in-javascript-pt1-theory-and-semantics

http://joseoncode.com/2011/09/26/a-walkthrough-jquery-deferred-and-promise/

http://www.intridea.com/blog/2011/2/8/fun-with-jquery-deferred
