
Regarding Dynamic Typing

Sunday, July 10th, 2011

Currently there seems to be a big hype about EcmaScript (JavaScript). Google probably wants to enslave the world with Chrome OS, where the system is not much more than a provider of an EcmaScript virtual machine (maybe it will support native applications through NaCl, just as Android does not only allow Java…), Microsoft wants to reimplement its Windows interface, and Qt invented QML, the EcmaScript extension we all know about, which provides cool declarative features. Today I want to talk about a fundamental property of EcmaScript and why it just sucks: dynamic typing.

Clarification

First let us clarify the meaning of “dynamic typing”. It means that the types of expressions get checked at runtime, and that expressions which may have any type are possible. It should not be confused with duck typing: many languages using dynamic typing have a lot of built-in functions relying on specific types (“the frog will not get accepted”), e.g. most functions in PHP’s standard library (expecting a string, an integer, an array or whatever). On the other hand, statically typed code can provide duck typing: C++ function templates such as the standard algorithms accept anything that looks like an iterator, and so did the signatures in old g++ versions. Determining types may happen at compile time (type inference) even with dynamic typing, and of course optimising compilers and interpreters do that.
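
To make the distinction concrete, here is a small Ruby sketch. The names total_length, WordBag and add_one are made up for this illustration. The first method is duck-typed and accepts anything with an each method, while Integer#+ is a built-in that insists on a numeric operand and only complains when the line is actually executed.

```ruby
# Duck typing: anything that responds to #each is accepted,
# no matter which class it belongs to.
def total_length(items)
  sum = 0
  items.each { |item| sum += item.length }
  sum
end

# A hypothetical class that is not an Array but "quacks" like a collection.
class WordBag
  def initialize(*words)
    @words = words
  end

  def each(&block)
    @words.each(&block)
  end
end

# Dynamic typing without duck typing: Integer#+ relies on a numeric
# operand, and a mismatch is only reported when the line is executed.
def add_one(value)
  1 + value
end

puts total_length(["frog", "duck"])             # an Array works
puts total_length(WordBag.new("frog", "duck"))  # so does a WordBag

begin
  add_one("frog")                               # the frog will not get accepted
rescue TypeError => e
  puts "rejected at runtime: #{e.message}"
end
```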

Impact on Development

Let us talk about an argument for dynamic typing: it makes life easier. That can actually be right. There are domains where you just do not want to care about such stuff, for example when writing a shell script with a few lines of code or when quickly doing some calculations with a computer algebra system. But Typo3, MediaWiki, Windows, Plasma etc. are much more than that. Why do I doubt that dynamic typing makes life easier in those contexts? Because it is error-prone. It is always better when errors get detected at compile time. It is good to fulfill contracts when programming, and they should get verified at compile time, so that violations can be found easily and will not annoy the user. One kind of contract (not the only one, cf. design by contract) which has been used for a long time is the type system: the programmer assures that a variable has a certain type. What happens in dynamically typed languages? You do not have to state the contract, and the compiler (or code checker) will usually not be able to check it. It exists only in your brain, but of course you will still rely on it. The type is something you rely on most of the time when programming: I know that x is an integer when I use x for some arithmetic. But you will make mistakes and you will get buggy software. That is the fundamental disadvantage, but of course I have to compare it to the advantage of dynamic typing: you can write code quickly and efficiently without mentioning the type everywhere. But there is a more proper way to achieve that: type inference. The type of a variable is determined by the compiler when the variable is initialised, and you get an error when you try to change the type. That is good because in most cases the type of a variable will not change. And you get informed about undefined variables (a typo should not cause a runtime error, but in dynamically typed languages it does).
For the case that you need a structure allowing different types at the same position there are algebraic data types. With algebraic data types you can state a contract with only a few tokens (instead of a nested array/dictionary data structure whose layout is only given implicitly by the code manipulating it, which often happens in dynamically typed languages), and for a variable declaration you need only one token, maybe a single character. That minimal overhead in code length is definitely worth it once the software has reached a certain complexity. That threshold is probably not very high: annoying mistakes which could have been avoided with static type checking can already occur in small programs that just compute some stuff.
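
A minimal Ruby sketch of the typo problem, with a made-up method price_with_discount: the misspelled variable sits in a branch that is rarely taken, so the program looks fine until that branch is executed.

```ruby
# Hypothetical example: the rarely used branch contains a typo
# ("discuont" instead of "discount").
def price_with_discount(price, discount)
  if discount > 0
    price - price * discuont / 100   # typo, only noticed when executed
  else
    price
  end
end

puts price_with_discount(200, 0)     # the common path works fine

begin
  price_with_discount(200, 10)       # the faulty branch fails only now
rescue NameError => e
  puts "runtime error: #{e.message}"
end
```

A compiler that resolves names and types up front would reject the method before the program ever runs.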

Performance

Dynamic typing causes big overhead because instructions have to be chosen at runtime based on type information all the time. Of course it is much more complicated to optimise dynamically typed languages: there might be corner cases where the type is not the expected one, and the runtime has to care about them. I often read statements like “the performance critical parts are implemented natively”, but regarding the number of applications running on such languages (JavaScript, PHP, Ruby, Python, Lua) we have to state: it is performance critical. PHP is used for more than a preprocessor, QML is used for more than just representing the UI, JavaScript is used for drawing a lot of complex stuff in the browser, Python gets used for scientific computations, and Ruby is establishing new standards regarding overhead (that is not true, Scheme has been that slow before ;), but Ruby allows modifying a lot of stuff at runtime, too). There is reasonable overhead, e.g. for abstraction, generalisation and internationalisation, but dynamic typing affects nearly every operation when running the program. That is unreasonable, and of course it will sum up to significant overhead, although it is simply not needed (and bad for the environment ;)).
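
The following Ruby sketch only illustrates where that choice happens; it is not a measurement. The method name combine is made up. The single expression a + b has to work for every class implementing +, so the concrete operation can only be selected once the operands are known.

```ruby
# The same source line serves integers, floats, strings and arrays,
# so the interpreter has to look up the right "+" on every call.
def combine(a, b)
  a + b
end

[[1, 2], [1.5, 2], ["ab", "cd"], [[1], [2]]].each do |a, b|
  result = combine(a, b)
  puts "#{a.class} + #{b.class} -> #{result.inspect} (#{result.class})"
end
```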

Special Issues

Regarding extreme flexibility

First of all: in 95% of applications you do not need it. You do not have to modify types at runtime, add member functions to classes or objects and all that stuff. Sometimes it may be a good way to establish abstraction etc., but in those cases there are usually alternatives: meta-programming can be done at compile time. The types manipulated at runtime in Ruby could usually have been manipulated at compile time, too, but Ruby does not support sophisticated compile time meta-programming (ML and Template Haskell do, in C++ and D it is kinda limited). Regarding collection of information, debugging etc. using such features: debugging facilities should not influence performance and cleanness. I am sure that with the help of meta-programming you could implement language features allowing that when debugging without neglecting the type system. And of course a lot of flexibility at runtime can be achieved without allowing any type everywhere: dynamic dispatch (including stuff like inheritance, interfaces, signatures and even multi-dispatch), variant types in a few places (e.g. QVariant, although I think it is used too often; signals can be implemented in a type safe way, and there are those type safe plugin factories as an alternative to QtScript and Kross), signals and slots, aspects etc.
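
For readers who have not seen this kind of flexibility, a short Ruby sketch of what I mean. The class Greeter and its methods are made up. A member function is added to an existing class while the program is running, and another one is added to a single object only.

```ruby
class Greeter
  def greet
    "hello"
  end
end

greeter = Greeter.new

# Reopen the class at runtime and add a member function;
# objects that already exist get it immediately.
Greeter.class_eval do
  define_method(:shout) { greet.upcase + "!" }
end

# Add a method to one single object only.
def greeter.whisper
  greet + "..."
end

puts greeter.shout
puts greeter.whisper
puts Greeter.new.respond_to?(:whisper)   # other instances lack the singleton method
```

Nothing in the class definition tells you which methods an object will have later on, and that is exactly what a static type system cannot check.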

Regarding EcmaScript

You might say that EcmaScript is becoming fast enough because of good compilers and extensions like type safe arrays (e.g. containing only floating point numbers). But EcmaScript will stay EcmaScript and it will keep the downsides of dynamic typing. Those type safe arrays are an ugly hack to make it feasible for some specific applications. It is simply lacking a proper type system and it will not get one.

Regarding QML

Using EcmaScript for QtScript was a pragmatic choice, no awesome innovation: there were many web developers who knew JavaScript. Unfortunately that caused yet another way to integrate scripts, and certainly not the most flexible one (cf. my previous blog post). For some reason they did not want to reuse KDE’s innovation (as with QtCreator and KDevelop, but that is really a different topic…). QML is based on EcmaScript because QtScript had been based on it before. Dynamic typing is definitely not an inherent property of such a declarative UI: most of it could have looked the same with a native implementation based on C++, and implementations in Ruby or whatever would be easily possible, too. I have to admit that C++ is not perfect. It does not provide sophisticated meta-programming, algebraic types or one-letter type inference (“auto” has four letters ;)). The last one may be a small problem, but overall it is simply not simple enough ;), and languages like Scala, D and OCaml have certain problems, too. Hence some of the non-declarative code in QML would have been disproportionately complicated compared to the declarative code. The general approach of declarative UI is certainly good, and now we probably have to accept that it has been implemented using EcmaScript. We can accept it as long as it is still possible to write Plasmoids using C++ or whatever, and obviously that is the case. Thus QML is generally a good development in my opinion, although implementing program logic in it is often not a good idea and although dynamic typing leaves a bitter aftertaste.

I hope you have got my points about dynamic typing. Any opinions?

Writing Konqueror-Plugins with Ruby

Monday, March 7th, 2011

Hi!

You may not know it, but you can write Konqueror-plugins in a scripting-language like Ruby. KDE/KParts/Konqueror have a very nice plugin-system: you can write plugins using C++, but you can also use a scripting-language like Ruby or Python. It should even work with Perl (untested); unfortunately there is no plugin-factory implementing the support for KJS and QtScript. Well, let us have a look at a simple plugin written in Ruby. There are four important files:

A desktop-file containing some meta-data

It works like a normal plugin-desktop-file (see e.g. /usr/share/kde4/apps/khtml/kpartplugins/plugin_adblock.desktop), but you need two special lines:

X-KDE-Library=krubypluginfactory
X-KDE-PluginKeyword=MyPlugin/MyPlugin.rb

Well, X-KDE-PluginKeyword is not a very expressive name for the name of the script to use (because it may be used for other stuff as well), but what is krubypluginfactory? It is a very cool invention: like the usual KDE-plugin-factories it instantiates plugin-objects, but this one instantiates special objects, where virtual method- and slot-invocations will be delegated to Ruby.

As usual you can use X-KDE-ParentApp (in our case konqueror) to tell KParts not to load the plugin in other applications using the KPart.

A .rc-GUI-XML-file

You may have heard about KXmlGui – you need it to define actions in the menubar etc., but you will even need it if you do not access any GUI-elements, otherwise Konqueror (or any application using KParts-plugins) will not find the plugin:

<!DOCTYPE kpartgui>
<kpartplugin name="MyPlugin" library="krubypluginfactory" version="0.1">
<MenuBar>
  <Menu name="tools"><text>&amp;Tools</text>
    <separator group="tools_operations" />
    <Action name="tools_MyPlugin" group="tools_operations" />
  </Menu>
</MenuBar>
</kpartplugin>

A CMake-file (CMakeLists.txt)

project(myplugin)
find_package(KDE4 REQUIRED)
include (KDE4Defaults)
install(PROGRAMS MyPlugin.rb DESTINATION ${DATA_INSTALL_DIR}/MyPlugin)
install(FILES myplugin.rc myplugin.desktop DESTINATION ${DATA_INSTALL_DIR}/khtml/kpartplugins)

That one is really trivial: some elementary checks, then the files get installed to the correct directories. Nothing has to be compiled, Ruby is an interpreted language. ;)

The Ruby-Code

The name of the Ruby-class has to match the filename, otherwise KRubyPluginFactory will not be able to find it (KPythonPluginFactory works differently, there you have to create a separate function to instantiate the plugin-object). Like normal plugins the class has to inherit KParts::Plugin; the constructor takes a parent-object (that is the KParts::Part-instance), a parent-widget and a list of arguments (which will usually be empty). There it can initialize the actions declared in the XML-file and add signal-slot-connections such that something will happen. What should happen in my example-plugin? It will find bad links to sites of evil companies (Microsoft, Apple, SAP, Oracle, Facebook, Google, but the list can be easily extended to your own moral standards :D), and when it finds such a link, it will get angry and will replace all the links on the website with good links, e.g. to Planet KDE, GNU, Wikipedia or the-user.org. :D For that it uses the DOM-API exposed by KHTML.

#!/usr/bin/env ruby
# Qt
require 'Qt'
# KDE-libraries
require 'korundum4'
# KHTML
require 'khtml'
# Ruby-CGI-library (for url-handling)
require 'cgi'
 
include KDE
include Qt
 
module MyPlugin # has to match the directory-name
 
class MyPlugin < KParts::Plugin # has to match the file-name
  # Slot-Declarations look that way, “a bit” C++-ish
  slots 'toggle(bool)', 'hoverUrl(const QString&)'
  def initialize(parent, parentWidget, args)
    super(parent)
 
    if !parent.is_a? KDE::HTMLPart
      qWarning("MyPlugin: Not a KHTML-Part")
      return
    end
    @part = parent
 
    # the checkable action that switches the detection on and off
    @action = Qt::Action.new("Enable bad link detection", self)
    @action.checkable = true
    # the action declared in the XML-file
    actionCollection().addAction("tools_MyPlugin", @action)
 
    # unfortunately you have to provide parameter lists
    connect(@action, SIGNAL('toggled(bool)'),
            self,    SLOT('toggle(bool)'))
 
    # good links
    @favLinks = ["http://planetkde.org",
                 "http://the-user.org",
                 "http://gnu.org",
                 "http://en.wikipedia.org"]
    # bad links
    @badLinks = ["microsoft", "apple", "sap",
                 "oracle", "facebook", "google"]
  end
  def toggle(active)
    if active
      connect(@part, SIGNAL('onURL(const QString&)'),
              self,  SLOT('hoverUrl(const QString&)'))
 
      # normal Qt API
      Qt::MessageBox::information(nil, "Success!",
          "Now you are safe, you will only visit good sites!")
    else
      disconnect(@part, SIGNAL('onURL(const QString&)'),
                 self,  SLOT('hoverUrl(const QString&)'))
      Qt::MessageBox::information(nil, "Success!",
          "Bad links will no longer be removed")
    end
  end
  # replace all the links recursively using the DOM-API
  def replaceLinks(dom)
    href = DOM::DOMString.new("href")
    # only anchors that actually carry an href-attribute
    if dom.nodeName.string == "A" &&
       dom.attributes.getNamedItem(href) != nil
      oldUrl = dom.attributes.getNamedItem(href).nodeValue.string
      # random good link, the old target survives in the fragment
      newUrl = @favLinks[rand(@favLinks.length)] +
               "/#" + CGI::escape(oldUrl)
      dom.attributes.getNamedItem(href).nodeValue =
          DOM::DOMString.new(newUrl)
    end
 
    # recursion
    children = dom.childNodes
    for i in 0..(children.length-1)
      child = children.item(i)
      replaceLinks(child)
    end
  end
  # slot invoked when hovering a link
  def hoverUrl(url)
    # this will happen when you move the cursor away
    return if url == nil
 
    found = false
    for badLink in @badLinks
      pos = url.index(badLink)
      sharpIndex = url.index("#")
      if pos != nil && (sharpIndex == nil || pos < sharpIndex)
        found = true
        break
      end
    end
 
    # maybe the link is not evil
    return unless found
 
    # replace all the links
    replaceLinks(@part.document())
  end
end
 
end
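
If you want to play with the link check without starting Konqueror, the core logic can be sketched in plain Ruby. The helper names bad_link? and good_link_for are made up for this sketch and are not part of the plugin. They mirror what hoverUrl and replaceLinks do with the URL strings, leaving out everything related to Qt and KHTML.

```ruby
require 'cgi'

# Hypothetical helper mirroring the check in hoverUrl: a URL counts as bad
# if one of the keywords occurs before the first "#" (or there is no "#").
def bad_link?(url, bad_links)
  return false if url.nil?
  sharp_index = url.index("#")
  bad_links.any? do |bad_link|
    pos = url.index(bad_link)
    !pos.nil? && (sharp_index.nil? || pos < sharp_index)
  end
end

# Hypothetical helper mirroring the URL construction in replaceLinks:
# the old target is kept, escaped, in the fragment of the new URL.
def good_link_for(old_url, fav_links)
  fav_links.sample + "/#" + CGI.escape(old_url)
end

bad = ["microsoft", "apple"]
puts bad_link?("http://www.apple.com/", bad)
puts bad_link?("http://planetkde.org/#http%3A%2F%2Fapple.com", bad)
puts good_link_for("http://apple.com", ["http://gnu.org"])
```

The check against the position of "#" is what keeps already replaced links from being detected as bad again, because the old URL is stored behind the "#".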

You may have noticed it: it is not that hard to write a Konqueror-plugin in a scripting-language like Ruby. It is not perfect: sometimes the debugging-output is bad, sometimes you need C++-style-stuff, but that is manageable without knowing anything about C++. And unfortunately there is no KHotNewStuff-integration. In theory it should work with KWebKit the same way (in theory those techniques work for all applications using the KPluginFactory for creating their plugins, and of course for all KParts), but unfortunately there is no Ruby-binding for KWebKit. So the Ruby-script will be invoked, but Ruby will not recognize that the part is a KWebKitPart, it will just be a KParts::ReadOnlyPart. You can access KWebKitPart’s overridden virtual functions, its signals and slots and its meta-object, but the most important method (KWebKitPart::view()), which returns the QWebView that can be used to manipulate all the stuff, will not be accessible. It would require some configuration-files to build Smoke and Korundum with KWebKit support.

Have fun! Maybe somebody will write a useful plugin. ;)